Homework No. 2
Completed by Boris Yuryevich Cherkasov, student of group 416, Faculty of Computational Mathematics and Cybernetics (VMK), Lomonosov Moscow State University
Vector time series models: VAR, VARMAX, VECM. The task is to pick stocks (from either the Russian or the US market), 3 stocks from each of 3 economic sectors, and build forecasts for them 10/20/100 trading sessions ahead. Do not forget the conclusions for each model and for the preliminary computations behind each model.
Deadline: November 23, 23:59
We take the data from Yahoo Finance, which covers both US and Russian stocks.
We use the following economic sectors:
Russia:
- Finance (banks): SBER.ME, VTBR.ME, TCSG.ME;
- Oil and gas: GAZP.ME, LKOH.ME, ROSN.ME;
- Metallurgy: NLMK.ME, GMKN.ME, CHMF.ME.
USA:
- Tech: AAPL, MSFT, NVDA;
- Retail: WMT, COST, TGT;
- Energy: XOM, CVX, COP.
P.S. As it turned out, once the records of the Russian banks from yfinance are merged, the data only span 2019 to 2022, so this sector has fewer observations than the other datasets. I assume that roughly 627 records will not be critical for the study ;)
# Required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime
import time
import warnings
warnings.filterwarnings('ignore')
# statistics
from statsmodels.tsa.stattools import adfuller, kpss
from statsmodels.tsa.vector_ar.var_model import VAR
from statsmodels.tsa.statespace.varmax import VARMAX
from statsmodels.tsa.vector_ar.vecm import coint_johansen, VECM
from statsmodels.tools.eval_measures import rmse, mse
import yfinance as yf
Loading the data
We download the stock data for the tickers listed above, covering the 10 years up to the current date.
# Example: downloading the data for Tinkoff (TCSG.ME)
import yfinance as yf
df = yf.download("TCSG.ME", period="5y", threads=False, progress=False)
print(df.tail())
Price         Close    High     Low    Open  Volume
Ticker      TCSG.ME TCSG.ME TCSG.ME TCSG.ME TCSG.ME
Date
2022-05-18   2402.0  2549.5  2360.0  2432.0  475508
2022-05-19   2236.5  2400.0  2205.0  2400.0  325695
2022-05-20   2236.5  2236.5  2236.5  2236.5       0
2022-05-23   2236.5  2236.5  2236.5  2236.5       0
2022-05-24   2236.5  2236.5  2236.5  2236.5       0
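The printed frame above has a two-level column header (Price on top, Ticker below). A minimal sketch of flattening such a header with pandas follows. The toy frame is built by hand for illustration (only two of the columns, with values copied from the output above), and the variable names `flat` and `close` are assumptions, not part of the notebook.

```python
import pandas as pd

# Toy frame mimicking the (Price, Ticker) header shown above.
cols = pd.MultiIndex.from_product(
    [["Close", "Volume"], ["TCSG.ME"]], names=["Price", "Ticker"]
)
df = pd.DataFrame(
    [[2402.0, 475508], [2236.5, 325695]],
    index=pd.to_datetime(["2022-05-18", "2022-05-19"]),
    columns=cols,
)
df.index.name = "Date"

# Drop the constant "Ticker" level so the columns become plain strings.
flat = df.droplevel("Ticker", axis=1)

# A single price series named after the ticker.
close = flat["Close"].rename("TCSG.ME")
print(list(flat.columns))
print(close)
```

Flattening before saving to CSV avoids the extra header rows that a later cleaning step would otherwise have to strip.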
# save_each_ticker.py
import os
import time
import yfinance as yf
# Parameters
OUTPUT_DIR = "market_data"  # folder for the CSV files
os.makedirs(OUTPUT_DIR, exist_ok=True)
PERIOD = "10y"  # "max" also works
RETRIES = 6
PAUSE = 1  # base pause (seconds) for the exponential backoff
TICKERS = [
    # Russia: finance, oil and gas, metallurgy
    "SBER.ME", "VTBR.ME", "TCSG.ME",
    "GAZP.ME", "LKOH.ME", "ROSN.ME",
    "NLMK.ME", "GMKN.ME", "CHMF.ME",
    # USA: tech, retail, energy
    "AAPL", "MSFT", "NVDA",
    "WMT", "COST", "TGT",
    "XOM", "CVX", "COP",
]
def csv_path(ticker):
    """Path of the CSV file that stores one ticker."""
    return os.path.join(OUTPUT_DIR, f"{ticker.replace('/', '_')}.csv")
def save_df_to_csv(df, ticker):
    """Save a DataFrame to CSV (dates stay in the index); return the path."""
    path = csv_path(ticker)
    df.to_csv(path)
    return path
def download_single(ticker, period=PERIOD, retries=RETRIES, pause=PAUSE):
    """
    Download one ticker from yfinance with retries, threads=False and exponential backoff.
    Return a DataFrame, or None if every attempt failed.
    """
    for attempt in range(1, retries + 1):
        try:
            df = yf.download(ticker, period=period, threads=False, progress=False)
            if df is None or df.empty:
                print(f"⚠ {ticker}: empty DataFrame (attempt {attempt}/{retries})")
            else:
                # Keep the whole frame (OHLCV and, if present, Adj Close): handier for analysis
                if 'Adj Close' in df.columns:
                    print(f"✅ {ticker}: downloaded (attempt {attempt}), {len(df)} rows")
                else:
                    print(f"ℹ {ticker}: downloaded, but there is no 'Adj Close' column (attempt {attempt})")
                return df
        except Exception as e:
            print(f"❌ {ticker}: download error (attempt {attempt}/{retries}): {e}")
        # back off before the next attempt
        sleep_time = pause * (2 ** (attempt - 1))
        print(f" ...waiting {sleep_time}s and retrying")
        time.sleep(sleep_time)
    # reaching this point means every attempt failed
    print(f"⛔ {ticker}: could NOT be downloaded after {retries} attempts")
    return None
def try_moex_fallback(ticker):
    """
    A simple hint: Russian tickers can also be fetched through MOEX ISS.
    No parser is implemented here; the function only tells the user what to try by hand.
    """
    print(f"→ fallback: if {ticker} is Russian and does not download, use the MOEX ISS API or the 'moex' library.")
# main loop: download the tickers one by one and save them
failed = []
for t in TICKERS:
    print(f"\n=== Downloading {t} ===")
    out_path = csv_path(t)
    # skip tickers that already have a file (comment this out to force a re-download)
    if os.path.exists(out_path):
        print(f"File {out_path} already exists, skipping (delete the file to re-download).")
        continue
    df = download_single(t)
    if df is not None and not df.empty:
        saved = save_df_to_csv(df, t)
        print(f"Saved to {saved}")
    else:
        failed.append(t)
        # quick fallback (for now only a notification)
        if t.endswith(".ME"):
            try_moex_fallback(t)
# summary
print("\n----- Finished -----")
print("Succeeded: ", [t for t in TICKERS if os.path.exists(csv_path(t))])
if failed:
    print("Failed to download:", failed)
else:
    print("All tickers downloaded successfully.")
=== Downloading SBER.ME ===
File market_data\SBER.ME.csv already exists, skipping (delete the file to re-download).
=== Downloading VTBR.ME ===
File market_data\VTBR.ME.csv already exists, skipping (delete the file to re-download).
=== Downloading TCSG.ME ===
File market_data\TCSG.ME.csv already exists, skipping (delete the file to re-download).
=== Downloading GAZP.ME ===
File market_data\GAZP.ME.csv already exists, skipping (delete the file to re-download).
=== Downloading LKOH.ME ===
File market_data\LKOH.ME.csv already exists, skipping (delete the file to re-download).
=== Downloading ROSN.ME ===
File market_data\ROSN.ME.csv already exists, skipping (delete the file to re-download).
=== Downloading NLMK.ME ===
File market_data\NLMK.ME.csv already exists, skipping (delete the file to re-download).
=== Downloading GMKN.ME ===
File market_data\GMKN.ME.csv already exists, skipping (delete the file to re-download).
=== Downloading CHMF.ME ===
File market_data\CHMF.ME.csv already exists, skipping (delete the file to re-download).
=== Downloading AAPL ===
File market_data\AAPL.csv already exists, skipping (delete the file to re-download).
=== Downloading MSFT ===
File market_data\MSFT.csv already exists, skipping (delete the file to re-download).
=== Downloading NVDA ===
File market_data\NVDA.csv already exists, skipping (delete the file to re-download).
=== Downloading WMT ===
File market_data\WMT.csv already exists, skipping (delete the file to re-download).
=== Downloading COST ===
File market_data\COST.csv already exists, skipping (delete the file to re-download).
=== Downloading TGT ===
File market_data\TGT.csv already exists, skipping (delete the file to re-download).
=== Downloading XOM ===
File market_data\XOM.csv already exists, skipping (delete the file to re-download).
=== Downloading CVX ===
File market_data\CVX.csv already exists, skipping (delete the file to re-download).
=== Downloading COP ===
File market_data\COP.csv already exists, skipping (delete the file to re-download).

----- Finished -----
Succeeded: ['SBER.ME', 'VTBR.ME', 'TCSG.ME', 'GAZP.ME', 'LKOH.ME', 'ROSN.ME', 'NLMK.ME', 'GMKN.ME', 'CHMF.ME', 'AAPL', 'MSFT', 'NVDA', 'WMT', 'COST', 'TGT', 'XOM', 'CVX', 'COP']
All tickers downloaded successfully.
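The download script waits `pause * 2 ** (attempt - 1)` seconds after each failed attempt. As a quick illustration of what that formula yields for the script's settings (6 retries, base pause of 1 second), here is a small sketch; the helper name `backoff_delays` is hypothetical and exists only for this example.

```python
def backoff_delays(retries, pause):
    """Delay after each attempt: pause * 2 ** (attempt - 1), attempts numbered from 1."""
    return [pause * 2 ** (attempt - 1) for attempt in range(1, retries + 1)]

delays = backoff_delays(retries=6, pause=1)
print(delays)       # [1, 2, 4, 8, 16, 32]
print(sum(delays))  # 63
```

So a ticker that never downloads costs about a minute of waiting in total, which is worth knowing before raising `RETRIES`.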
Next, we build a separate dataset for each sector to simplify the analysis, and from here on we work with DataFrames.
For convenience, we implement a cleaning function that brings all the data into a uniform shape.
# Cleaning function that produces the series we work with
def load_and_clean_csv(path, ticker_name=None):
    """
    Load a CSV, drop junk rows (repeated headers, rows without a valid date,
    non-numeric rows) and return the price Series.
    """
    df = pd.read_csv(path)
    # 1. Drop rows that repeat the header (any cell containing a column keyword)
    header_like = df.apply(
        lambda r: r.astype(str).str.contains("Open|Close|High|Low|Adj|Volume").any(),
        axis=1,
    )
    df = df[~header_like].copy()
    # 2. Locate the date column; if there is no "Date" column, assume it is the first one
    if "Date" not in df.columns:
        df = df.rename(columns={df.columns[0]: "Date"})
    df["Date"] = pd.to_datetime(df["Date"], errors="coerce")
    # 3. Drop rows whose date could not be parsed
    df = df[df["Date"].notna()]
    df = df.set_index("Date")
    df = df.sort_index()
    # 4. Find the price column
    price_col = None
    preferred = ["Adj Close", "Close", "close", "adjclose"]
    for col in preferred:
        if col in df.columns:
            price_col = col
            break
    if price_col is None:
        # fallback: the first numeric column
        numeric_cols = df.select_dtypes(include=[np.number]).columns
        if len(numeric_cols) == 0:
            raise ValueError(f"{ticker_name}: no numeric columns left after cleaning")
        price_col = numeric_cols[0]
    # 5. Convert the price to float
    s = pd.to_numeric(df[price_col], errors="coerce")
    # 6. Drop NaN rows
    s = s.dropna()
    s.name = ticker_name if ticker_name else price_col
    return s
import os
# Directory with the data downloaded from yfinance
DATA_DIR = "market_data"
# Load every ticker from CSV → dict[ticker] = Series,
# using the cleaning function defined above
dfs = {}
for f in os.listdir(DATA_DIR):
    if f.endswith(".csv"):
        ticker = f.replace(".csv", "")
        path = os.path.join(DATA_DIR, f)
        try:
            s = load_and_clean_csv(path, ticker_name=ticker)
            dfs[ticker] = s
        except Exception as e:
            print(f"❌ Error while loading {ticker}: {e}")
# Groups
rus_fin = ["SBER.ME", "VTBR.ME", "TCSG.ME"]
rus_oil = ["GAZP.ME", "LKOH.ME", "ROSN.ME"]
rus_met = ["NLMK.ME", "GMKN.ME", "CHMF.ME"]
us_tech = ["AAPL", "MSFT", "NVDA"]
us_retail = ["WMT", "COST", "TGT"]
us_energy = ["XOM", "CVX", "COP"]
groups = {
    "rus_fin": rus_fin,
    "rus_oil": rus_oil,
    "rus_met": rus_met,
    "us_tech": us_tech,
    "us_retail": us_retail,
    "us_energy": us_energy,
}
# Build the DataFrame of one sector (only dates present for all its tickers)
def build_sector_df(tickers):
    df = pd.concat([dfs[t] for t in tickers], axis=1, join="inner")
    df.columns = tickers
    df = df.dropna()
    return df
sector_data = {name: build_sector_df(ticks) for name, ticks in groups.items()}
# Save
for name, df in sector_data.items():
    df.to_csv(f"{name}_prices.csv")
    print(name, df.shape, df.index.min(), df.index.max())
rus_fin (626, 3) 2019-10-29 00:00:00 2022-05-24 00:00:00
rus_oil (1616, 3) 2015-11-20 00:00:00 2022-05-24 00:00:00
rus_met (1616, 3) 2015-11-20 00:00:00 2022-05-24 00:00:00
us_tech (2514, 3) 2015-11-23 00:00:00 2025-11-20 00:00:00
us_retail (2514, 3) 2015-11-23 00:00:00 2025-11-20 00:00:00
us_energy (2514, 3) 2015-11-23 00:00:00 2025-11-20 00:00:00
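The shapes above depend on `join="inner"` in `pd.concat`: a sector keeps only the dates on which every ticker has a price, so one short series shortens the whole sector. The sketch below illustrates this on made-up toy series (the names `a`, `b` and all values are invented for the example).

```python
import pandas as pd

# Toy series: "A" has a longer history than "B".
a = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=pd.to_datetime(["2019-10-25", "2019-10-28", "2019-10-29", "2019-10-30"]),
    name="A",
)
b = pd.Series(
    [10.0, 11.0],
    index=pd.to_datetime(["2019-10-29", "2019-10-30"]),
    name="B",
)

inner = pd.concat([a, b], axis=1, join="inner")  # dates present in both series
outer = pd.concat([a, b], axis=1, join="outer")  # union of dates, NaN where missing

print(inner.shape)  # (2, 2)
print(outer.shape)  # (4, 2)
```

An inner join trades sample length for a gap-free panel, which is what the vector models later in the notebook need.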
Great, now we can move on to the EDA.
EDA + stationarity tests
To keep things simple, we implement helper functions for the stationarity tests (Dickey-Fuller and KPSS), correlation matrices, heatmaps and so on.
We also apply a log transform (log returns) so that the vector models fitted later are more stable...
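As a small illustration of the log transform, the sketch below computes log returns as first differences of log prices and shows that cumulating them recovers the original price path. The prices are made up for the example.

```python
import numpy as np

prices = np.array([100.0, 102.0, 99.0, 105.0])  # made-up prices

# r_t = ln(P_t) - ln(P_{t-1})
log_returns = np.diff(np.log(prices))

# Inverse transform: P_t = P_0 * exp(r_1 + ... + r_t)
rebuilt = prices[0] * np.exp(np.cumsum(log_returns))

print(np.allclose(rebuilt, prices[1:]))  # True
```

Because log returns add up over time, a multi-step return forecast can be turned back into a price forecast with a cumulative sum and one exponential.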
import seaborn as sns
import scipy.stats as st
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
SECTORS = {
"rus_fin": "rus_fin_prices.csv",
"rus_oil": "rus_oil_prices.csv",
"rus_met": "rus_met_prices.csv",
"us_tech": "us_tech_prices.csv",
"us_retail": "us_retail_prices.csv",
"us_energy": "us_energy_prices.csv"
}
def load_sector(path):
    df = pd.read_csv(path, index_col=0, parse_dates=True)
    df = df.sort_index()
    return df
def adf_test(x):
    res = adfuller(x, autolag="AIC")
    return res[1]  # p-value
def kpss_test(x):
    # statsmodels interpolates the KPSS p-value from a table, so it is reported
    # only within [0.01, 0.1]: 0.01 means "0.01 or less", 0.1 means "0.1 or more"
    try:
        stat, p, lags, crit = kpss(x, regression="c")
        return p
    except Exception:
        return np.nan
# Q-Q plots against the normal distribution: inference in a VAR is usually based on approximately normal residuals
def qq_plots(df_returns, sector_name):
    plt.figure(figsize=(14, 4))
    for i, col in enumerate(df_returns.columns):
        plt.subplot(1, len(df_returns.columns), i + 1)
        st.probplot(df_returns[col], dist="norm", plot=plt)
        plt.title(col)
    plt.suptitle(f"QQ-plots ({sector_name})")
    plt.show()
# ACF/PACF plots help to judge which model and which lag orders are suitable
def acf_pacf(df_returns, sector_name):
    for col in df_returns.columns:
        fig, ax = plt.subplots(1, 2, figsize=(12, 4))
        plot_acf(df_returns[col], ax=ax[0])
        plot_pacf(df_returns[col], ax=ax[1])
        fig.suptitle(f"{sector_name} - {col}")
        plt.show()
# Rolling standard deviation over roughly one month, annualized with sqrt(252), to see how variability changes over time
def rolling_vol(df_returns, sector_name, window=30):
    vol = df_returns.rolling(window).std() * np.sqrt(252)
    vol.plot(figsize=(12, 5))
    plt.title(f"{sector_name} - {window}-day annualized volatility")
    plt.grid(True)
    plt.show()
# Lagged correlations as a hint of dependencies useful for a VAR.
# corrwith pairs columns by name, so this is each series' correlation with its own lag, not a cross-correlation between tickers.
def lagged_cross_corr(df_returns, sector_name, lag=1):
    df_lag = df_returns.shift(lag)
    corr = df_returns.corrwith(df_lag)
    print(f"\nLagged correlation (lag={lag}) for {sector_name}:")
    display(corr)
# Just for fun: cluster the observations into "market regimes" with k-means
def cluster_market_regimes(df_returns, sector_name, k=3):
    scaler = StandardScaler()
    X = scaler.fit_transform(df_returns)
    km = KMeans(n_clusters=k, random_state=42)
    labels = km.fit_predict(X)
    plt.figure(figsize=(14, 4))
    plt.plot(df_returns.index, labels, label="Regime")
    plt.title(f"{sector_name} - Market Regimes (k-means)")
    plt.grid(True)
    plt.show()
# ======================================================
# EDA Function (plots + stats + tests + conclusions)
# ======================================================
def run_eda(name, df_prices):
    print("="*80)
    print(f"📌 EDA SECTOR: {name.upper()}")
    print("="*80)
    print("\nData shape:", df_prices.shape)
    print("Date range:", df_prices.index.min(), "→", df_prices.index.max())
    display(df_prices.head())
    # --------------------------- Prices plot
    plt.figure(figsize=(14, 5))
    plt.plot(df_prices)
    plt.title(f"Prices ({name})")
    plt.grid(True)
    plt.legend(df_prices.columns)
    plt.show()
    # --------------------------- Log Returns
    df_returns = np.log(df_prices).diff().dropna()
    plt.figure(figsize=(14, 5))
    plt.plot(df_returns)
    plt.title(f"Log returns ({name})")
    plt.grid(True)
    plt.legend(df_returns.columns)
    plt.show()
    # --------------------------- Distribution
    df_returns.hist(bins=50, figsize=(12, 6))
    plt.suptitle(f"Histograms of returns: {name}")
    plt.show()
    # --------------------------- Boxplot
    plt.figure(figsize=(10, 5))
    sns.boxplot(data=df_returns)
    plt.title(f"Boxplot of returns ({name})")
    plt.show()
    # --------------------------- Correlation
    plt.figure(figsize=(6, 5))
    sns.heatmap(df_returns.corr(), annot=True, cmap="coolwarm", vmin=-1, vmax=1)
    plt.title(f"Correlation matrix of returns ({name})")
    plt.show()
    # --------------------------- ADF & KPSS
    print("\n📊 Stationarity tests (LEVELS):")
    adf_levels = {col: adf_test(df_prices[col]) for col in df_prices.columns}
    kpss_levels = {col: kpss_test(df_prices[col]) for col in df_prices.columns}
    display(pd.DataFrame({"ADF p-value": adf_levels, "KPSS p-value": kpss_levels}))
    # NOTE: the helpers below name their argument df_returns but receive PRICE LEVELS here,
    # so the ACF/PACF, Q-Q plots, rolling "volatility", lag-1 correlations and clusters
    # all describe prices (hence lag-1 correlations close to 1). Pass df_returns to analyse returns.
    # --------------------------- ACF & PACF
    acf_pacf(df_prices, name)
    # --------------------------- Q-Q plot
    qq_plots(df_prices, name)
    # --------------------------- Rolling volatility
    rolling_vol(df_prices, name)
    # --------------------------- Lagged cross correlation
    lagged_cross_corr(df_prices, name)
    # --------------------------- Clustering analysis
    cluster_market_regimes(df_prices, name)
    print("\n📊 Stationarity tests (LOG RETURNS):")
    adf_ret = {col: adf_test(df_returns[col]) for col in df_returns.columns}
    kpss_ret = {col: kpss_test(df_returns[col]) for col in df_returns.columns}
    display(pd.DataFrame({"ADF p-value": adf_ret, "KPSS p-value": kpss_ret}))
    # --------------------------- Conclusions
    print("\n📌 CONCLUSIONS FOR SECTOR", name.upper(), "\n")
    # Levels
    print("• Price levels: typically NOT stationary.")
    print(" - ADF p-value > 0.05 → the unit root is not rejected.")
    print(" - KPSS p-value < 0.05 → the series is NOT stationary.")
    print(" This is NORMAL for financial prices → VECM on levels is possible.")
    # Returns
    print("• Log returns: usually stationary.")
    print(" - ADF p < 0.05 → stationary.")
    print(" - KPSS p > 0.05 → stationarity is confirmed.")
    print(" This makes returns a valid choice for VAR/VARMAX.")
    # Correlations
    corr = df_returns.corr()
    avg_corr = corr.values[np.triu_indices_from(corr.values, k=1)].mean()
    print(f"• Average within-sector correlation: {avg_corr:.3f}")
    if avg_corr > 0.6:
        print(" ➝ The sector moves in sync; VAR/VECM should detect strong links.")
    elif avg_corr < 0.2:
        print(" ➝ The assets are weakly related; VAR may be weak.")
    else:
        print(" ➝ Moderate link; the models should reveal cross effects.")
    print("\nDone.\n\n")
    return df_returns
# =====================
all_returns = {}
for sector, file in SECTORS.items():
    df = load_sector(file)
    df_ret = run_eda(sector, df)
    all_returns[sector] = df_ret
print("\n\n🎉 FULL EDA COMPLETED FOR ALL SECTORS")
================================================================================
📌 EDA SECTOR: RUS_FIN
================================================================================
Data shape: (626, 3)
Date range: 2019-10-29 00:00:00 → 2022-05-24 00:00:00
| Date | SBER.ME | VTBR.ME | TCSG.ME |
|---|---|---|---|
| 2019-10-29 | 207.504654 | 691.412476 | 1219.064453 |
| 2019-10-30 | 206.986130 | 698.477478 | 1216.265869 |
| 2019-10-31 | 202.993500 | 691.894165 | 1215.266479 |
| 2019-11-01 | 204.298462 | 692.857605 | 1213.267700 |
| 2019-11-05 | 206.139221 | 706.987732 | 1197.277466 |
📊 Stationarity tests (LEVELS):
| Ticker | ADF p-value | KPSS p-value |
|---|---|---|
| SBER.ME | 0.873321 | 0.01 |
| VTBR.ME | 0.598282 | 0.01 |
| TCSG.ME | 0.702201 | 0.01 |
Lagged correlation (lag=1) for rus_fin:
SBER.ME 0.994662
VTBR.ME 0.997195
TCSG.ME 0.997232
dtype: float64
📊 Stationarity tests (LOG RETURNS):
| Ticker | ADF p-value | KPSS p-value |
|---|---|---|
| SBER.ME | 6.279155e-15 | 0.100000 |
| VTBR.ME | 4.694274e-14 | 0.100000 |
| TCSG.ME | 1.131758e-29 | 0.053674 |
📌 CONCLUSIONS FOR SECTOR RUS_FIN
• Price levels: typically NOT stationary.
 - ADF p-value > 0.05 → the unit root is not rejected.
 - KPSS p-value < 0.05 → the series is NOT stationary.
 This is NORMAL for financial prices → VECM on levels is possible.
• Log returns: usually stationary.
 - ADF p < 0.05 → stationary.
 - KPSS p > 0.05 → stationarity is confirmed.
 This makes returns a valid choice for VAR/VARMAX.
• Average within-sector correlation: 0.764
 ➝ The sector moves in sync; VAR/VECM should detect strong links.
Done.
================================================================================
📌 EDA SECTOR: RUS_OIL
================================================================================
Data shape: (1616, 3)
Date range: 2015-11-20 00:00:00 → 2022-05-24 00:00:00
| Date | GAZP.ME | LKOH.ME | ROSN.ME |
|---|---|---|---|
| 2015-11-20 | 101.554611 | 1584.009644 | 209.611526 |
| 2015-11-23 | 104.264091 | 1654.354858 | 211.070862 |
| 2015-11-24 | 99.908340 | 1607.021851 | 204.119675 |
| 2015-11-25 | 98.769669 | 1625.730225 | 208.574554 |
| 2015-11-26 | 98.227753 | 1620.554443 | 208.036926 |
📊 Stationarity tests (LEVELS):
| Ticker | ADF p-value | KPSS p-value |
|---|---|---|
| GAZP.ME | 0.719025 | 0.01 |
| LKOH.ME | 0.395897 | 0.01 |
| ROSN.ME | 0.364743 | 0.01 |
Lagged correlation (lag=1) for rus_oil:
GAZP.ME 0.998291
LKOH.ME 0.997971
ROSN.ME 0.997103
dtype: float64
📊 Stationarity tests (LOG RETURNS):
| Ticker | ADF p-value | KPSS p-value |
|---|---|---|
| GAZP.ME | 7.368139e-30 | 0.1 |
| LKOH.ME | 7.547072e-10 | 0.1 |
| ROSN.ME | 1.210841e-27 | 0.1 |
📌 CONCLUSIONS FOR SECTOR RUS_OIL
• Price levels: typically NOT stationary.
 - ADF p-value > 0.05 → the unit root is not rejected.
 - KPSS p-value < 0.05 → the series is NOT stationary.
 This is NORMAL for financial prices → VECM on levels is possible.
• Log returns: usually stationary.
 - ADF p < 0.05 → stationary.
 - KPSS p > 0.05 → stationarity is confirmed.
 This makes returns a valid choice for VAR/VARMAX.
• Average within-sector correlation: 0.719
 ➝ The sector moves in sync; VAR/VECM should detect strong links.
Done.
================================================================================
📌 EDA SECTOR: RUS_MET
================================================================================
Data shape: (1616, 3)
Date range: 2015-11-20 00:00:00 → 2022-05-24 00:00:00
| Date | NLMK.ME | GMKN.ME | CHMF.ME |
|---|---|---|---|
| 2015-11-20 | 33.449692 | 9591.611328 | 299.263031 |
| 2015-11-23 | 33.691441 | 9434.567383 | 298.410431 |
| 2015-11-24 | 32.682610 | 9287.459961 | 289.234833 |
| 2015-11-25 | 32.961552 | 9369.958984 | 290.371674 |
| 2015-11-26 | 33.166107 | 9333.181641 | 293.132446 |
📊 Stationarity tests (LEVELS):
| Ticker | ADF p-value | KPSS p-value |
|---|---|---|
| NLMK.ME | 0.662816 | 0.01 |
| GMKN.ME | 0.776743 | 0.01 |
| CHMF.ME | 0.834555 | 0.01 |
Lagged correlation (lag=1) for rus_met:
NLMK.ME 0.998954
GMKN.ME 0.998503
CHMF.ME 0.998702
dtype: float64
📊 Stationarity tests (LOG RETURNS):
| Ticker | ADF p-value | KPSS p-value |
|---|---|---|
| NLMK.ME | 2.130207e-26 | 0.1 |
| GMKN.ME | 0.000000e+00 | 0.1 |
| CHMF.ME | 1.231408e-16 | 0.1 |
📌 CONCLUSIONS FOR SECTOR RUS_MET
• Price levels: typically NOT stationary.
 - ADF p-value > 0.05 → the unit root is not rejected.
 - KPSS p-value < 0.05 → the series is NOT stationary.
 This is NORMAL for financial prices → VECM on levels is possible.
• Log returns: usually stationary.
 - ADF p < 0.05 → stationary.
 - KPSS p > 0.05 → stationarity is confirmed.
 This makes returns a valid choice for VAR/VARMAX.
• Average within-sector correlation: 0.511
 ➝ Moderate link; the models should reveal cross effects.
Done.
================================================================================
📌 EDA SECTOR: US_TECH
================================================================================
Data shape: (2514, 3)
Date range: 2015-11-23 00:00:00 → 2025-11-20 00:00:00
| Date | AAPL | MSFT | NVDA |
|---|---|---|---|
| 2015-11-23 | 26.548964 | 47.538147 | 0.754261 |
| 2015-11-24 | 26.803743 | 47.590771 | 0.760359 |
| 2015-11-25 | 26.612099 | 47.099518 | 0.759383 |
| 2015-11-27 | 26.562492 | 47.310066 | 0.765725 |
| 2015-11-30 | 26.672977 | 47.678497 | 0.773776 |
📊 Stationarity tests (LEVELS):
| Ticker | ADF p-value | KPSS p-value |
|---|---|---|
| AAPL | 0.982796 | 0.01 |
| MSFT | 0.967317 | 0.01 |
| NVDA | 0.998576 | 0.01 |
Lagged correlation (lag=1) for us_tech:
AAPL 0.999472
MSFT 0.999580
NVDA 0.999343
dtype: float64
📊 Stationarity tests (LOG RETURNS):
| Ticker | ADF p-value | KPSS p-value |
|---|---|---|
| AAPL | 5.121544e-29 | 0.1 |
| MSFT | 7.340334e-30 | 0.1 |
| NVDA | 2.902646e-30 | 0.1 |
📌 CONCLUSIONS FOR SECTOR US_TECH
• Price levels: typically NOT stationary.
 - ADF p-value > 0.05 → the unit root is not rejected.
 - KPSS p-value < 0.05 → the series is NOT stationary.
 This is NORMAL for financial prices → VECM on levels is possible.
• Log returns: usually stationary.
 - ADF p < 0.05 → stationary.
 - KPSS p > 0.05 → stationarity is confirmed.
 This makes returns a valid choice for VAR/VARMAX.
• Average within-sector correlation: 0.617
 ➝ The sector moves in sync; VAR/VECM should detect strong links.
Done.
================================================================================
📌 EDA SECTOR: US_RETAIL
================================================================================
Data shape: (2514, 3)
Date range: 2015-11-23 00:00:00 → 2025-11-20 00:00:00
| Date | WMT | COST | TGT |
|---|---|---|---|
| 2015-11-23 | 16.636299 | 137.534073 | 53.716831 |
| 2015-11-24 | 16.542435 | 136.473801 | 54.170162 |
| 2015-11-25 | 16.630775 | 136.642120 | 54.370827 |
| 2015-11-27 | 16.534153 | 137.643402 | 54.578926 |
| 2015-11-30 | 16.244270 | 135.825851 | 53.880329 |
📊 Stationarity tests (LEVELS):
| Ticker | ADF p-value | KPSS p-value |
|---|---|---|
| WMT | 0.998771 | 0.01 |
| COST | 0.983306 | 0.01 |
| TGT | 0.558303 | 0.01 |
Lagged correlation (lag=1) for us_retail:
WMT 0.999530
COST 0.999683
TGT 0.998726
dtype: float64
📊 Stationarity tests (LOG RETURNS):
| Ticker | ADF p-value | KPSS p-value |
|---|---|---|
| WMT | 6.741837e-30 | 0.1 |
| COST | 2.555191e-19 | 0.1 |
| TGT | 0.000000e+00 | 0.1 |
📌 CONCLUSIONS FOR SECTOR US_RETAIL
• Price levels: typically NOT stationary.
 - ADF p-value > 0.05 → the unit root is not rejected.
 - KPSS p-value < 0.05 → the series is NOT stationary.
 This is NORMAL for financial prices → VECM on levels is possible.
• Log returns: usually stationary.
 - ADF p < 0.05 → stationary.
 - KPSS p > 0.05 → stationarity is confirmed.
 This makes returns a valid choice for VAR/VARMAX.
• Average within-sector correlation: 0.463
 ➝ Moderate link; the models should reveal cross effects.
Done.
================================================================================
📌 EDA SECTOR: US_ENERGY
================================================================================
Data shape: (2514, 3)
Date range: 2015-11-23 00:00:00 → 2025-11-20 00:00:00
| Date | XOM | CVX | COP |
|---|---|---|---|
| 2015-11-23 | 51.805092 | 58.679672 | 39.201073 |
| 2015-11-24 | 52.837589 | 59.553265 | 40.702011 |
| 2015-11-25 | 52.431053 | 59.240341 | 40.010406 |
| 2015-11-27 | 52.418140 | 58.914387 | 39.348213 |
| 2015-11-30 | 52.695629 | 59.533695 | 39.767601 |
📊 Stationarity tests (LEVELS):
| Ticker | ADF p-value | KPSS p-value |
|---|---|---|
| XOM | 0.895126 | 0.01 |
| CVX | 0.612451 | 0.01 |
| COP | 0.655668 | 0.01 |
Lagged correlation (lag=1) for us_energy:
XOM 0.998986
CVX 0.998538
COP 0.998804
dtype: float64
📊 Stationarity tests (LOG RETURNS):
| Ticker | ADF p-value | KPSS p-value |
|---|---|---|
| XOM | 2.754644e-30 | 0.1 |
| CVX | 3.472096e-27 | 0.1 |
| COP | 1.632428e-18 | 0.1 |
📌 CONCLUSIONS FOR SECTOR US_ENERGY
• Price levels: typically NOT stationary.
 - ADF p-value > 0.05 → the unit root is not rejected.
 - KPSS p-value < 0.05 → the series is NOT stationary.
 This is NORMAL for financial prices → VECM on levels is possible.
• Log returns: usually stationary.
 - ADF p < 0.05 → stationary.
 - KPSS p > 0.05 → stationarity is confirmed.
 This makes returns a valid choice for VAR/VARMAX.
• Average within-sector correlation: 0.813
 ➝ The sector moves in sync; VAR/VECM should detect strong links.
Done.

🎉 FULL EDA COMPLETED FOR ALL SECTORS
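The printed conclusions state the ADF/KPSS decision rule in general terms. A minimal sketch of applying that rule to concrete p-values is given below; the function name `classify_stationarity` and the "inconclusive" label for disagreeing tests are assumptions made for this example, and the p-values are copied from the RUS_FIN tables in the output above.

```python
def classify_stationarity(adf_p, kpss_p, alpha=0.05):
    """Combine ADF (H0: unit root) and KPSS (H0: stationary) p-values into a verdict."""
    adf_rejects_unit_root = adf_p < alpha
    kpss_rejects_stationarity = kpss_p < alpha
    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary"
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "non-stationary"
    return "inconclusive"  # the two tests disagree

sber_levels = classify_stationarity(0.873321, 0.01)
sber_returns = classify_stationarity(6.279155e-15, 0.10)
tcsg_returns = classify_stationarity(1.131758e-29, 0.053674)

print(sber_levels)   # non-stationary
print(sber_returns)  # stationary
print(tcsg_returns)  # stationary
```

The TCSG.ME returns pass only narrowly on the KPSS side (0.0537 against a 0.05 threshold), which a per-ticker verdict like this makes visible while the generic printed text does not.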
Running VAR
import os
# Create folders for storing the results
OUT_VAR = "VAR_results"
OUT_VECM = "VECM_results"
OUT_VARMAX = "VARMAX_results"
os.makedirs(OUT_VAR, exist_ok=True)
os.makedirs(OUT_VECM, exist_ok=True)
os.makedirs(OUT_VARMAX, exist_ok=True)
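The VAR pipeline below records whether each fitted model is stable. As background, here is a minimal NumPy sketch of the usual criterion: a VAR(p) is stable when all eigenvalues of its companion matrix lie strictly inside the unit circle. The helper names and the coefficient matrices are made up for illustration. As far as I know, statsmodels reports `VARResults.roots` as the reciprocals of these eigenvalues, so there the stable case corresponds to moduli above 1; treat that remark as an assumption to verify against the library documentation.

```python
import numpy as np

def companion_matrix(coefs):
    """Stack VAR(p) coefficient matrices A_1..A_p, shape (p, k, k), into the (k*p, k*p) companion matrix."""
    p, k, _ = coefs.shape
    top = np.hstack(list(coefs))         # [A_1 A_2 ... A_p]
    bottom = np.eye(k * (p - 1), k * p)  # identity blocks that shift the lags down
    return np.vstack([top, bottom])

def is_stable(coefs):
    """True if every companion eigenvalue has modulus strictly below 1."""
    eig = np.linalg.eigvals(companion_matrix(coefs))
    return bool(np.all(np.abs(eig) < 1))

stationary_var1 = np.array([[[0.5, 0.1],
                             [0.0, 0.3]]])  # eigenvalues 0.5 and 0.3
random_walk = np.array([np.eye(2)])         # eigenvalues equal to 1

print(is_stable(stationary_var1))  # True
print(is_stable(random_walk))      # False
```

The random-walk case is the reason the models are fitted on log returns rather than on price levels.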
# VAR_pipeline_all_sectors.py
import os
import warnings
warnings.filterwarnings("ignore")
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
from statsmodels.tsa.vector_ar.var_model import VAR
from statsmodels.tools.eval_measures import rmse
from statsmodels.tsa.stattools import adfuller
from statsmodels.stats.stattools import durbin_watson
# ==========================
# Settings
# ==========================
SECTORS = {
"rus_fin": "rus_fin_prices.csv",
"rus_oil": "rus_oil_prices.csv",
"rus_met": "rus_met_prices.csv",
"us_tech": "us_tech_prices.csv",
"us_retail": "us_retail_prices.csv",
"us_energy": "us_energy_prices.csv"
}
OUT_DIR = "VAR_results"
os.makedirs(OUT_DIR, exist_ok=True)
# Forecast horizons to produce
FORECAST_HORIZONS = [10, 20, 100]
# How many last observations to use as holdout for forecast evaluation
HOLDOUT_DAYS = 100  # can be changed
# Max lags to consider in select_order
MAX_LAGS = 10
# ==========================
# Utility functions
# ==========================
def ensure_df_numeric(df):
    """Ensure all columns numeric; coerce non-numeric to NaN then drop cols with all NaN."""
    df2 = df.copy()
    for c in df2.columns:
        df2[c] = pd.to_numeric(df2[c], errors="coerce")
    df2 = df2.dropna(axis=1, how='all')
    return df2
def compute_log_returns(df):
    # natural log returns
    return np.log(df).diff().dropna(how='all')
def savefig(fig, path):
    fig.savefig(path, bbox_inches='tight', dpi=150)
    plt.close(fig)
from statsmodels.stats.diagnostic import het_arch
def summarize_var_results(var_res):
    """Return dict summary of diagnostics we will store."""
    res = {}
    # stability: statsmodels' `roots` are reciprocals of the companion eigenvalues,
    # so a stable VAR has every root OUTSIDE the unit circle
    try:
        roots = var_res.roots
        res['stable'] = bool(var_res.is_stable())
        # largest companion eigenvalue modulus (< 1 for a stable VAR)
        res['max_root_modulus'] = 1.0 / np.min(np.abs(roots))
    except Exception:
        res['stable'] = np.nan
        res['max_root_modulus'] = np.nan
    # serial correlation: Portmanteau (whiteness) test; nlags must exceed the VAR order
    try:
        serial_test = var_res.test_whiteness(nlags=var_res.k_ar + 10)
        res['serial_pvalue'] = serial_test.pvalue
    except Exception:
        res['serial_pvalue'] = np.nan
    # normality of residuals
    try:
        res['normality_pvalue'] = var_res.test_normality().pvalue
    except Exception:
        res['normality_pvalue'] = np.nan
    # arch: Engle's LM test per equation; keep the smallest p-value
    try:
        resid = np.asarray(var_res.resid)
        res['arch_pvalue'] = min(het_arch(resid[:, i])[1] for i in range(resid.shape[1]))
    except Exception:
        res['arch_pvalue'] = np.nan
    # durbin-watson for residuals per equation (average)
    try:
        dw = durbin_watson(np.asarray(var_res.resid))
        res['dw_mean'] = np.mean(dw)
    except Exception:
        res['dw_mean'] = np.nan
    return res
def forecast_and_reconstruct_prices(var_res, df_prices, df_returns_train, steps):
    """
    Forecast returns for 'steps' ahead using var_res (fitted on returns),
    then reconstruct levels (prices) by cumulatively applying returns to last known prices.
    df_prices: DataFrame of prices (aligned with returns)
    df_returns_train: returns DataFrame that was used to fit var_res
    """
    last_price = df_prices.iloc[-1]  # last observed prices
    # forecast returns
    fc_returns = var_res.forecast(y=df_returns_train.values[-var_res.k_ar:], steps=steps)
    # business-day index covering exactly `steps` forecast dates
    start = df_prices.index[-1] + pd.Timedelta(days=1)
    idx = pd.date_range(start=start, periods=steps, freq='B')
    fc_returns_df = pd.DataFrame(fc_returns, index=idx, columns=df_returns_train.columns)
    # reconstruct price paths: price_{t+h} = last_price * exp(sum of the first h forecast returns)
    prices_fc_df = np.exp(fc_returns_df.cumsum()).mul(last_price, axis=1)
    return fc_returns_df, prices_fc_df
# ==========================
# Main pipeline per sector
# ==========================
summary_rows = []
for sector_name, filename in SECTORS.items():
    print("\n" + "="*80)
    print(f"Processing sector: {sector_name}")
    print("="*80)
    try:
        df_prices = pd.read_csv(filename, index_col=0, parse_dates=True)
    except Exception as e:
        print(f"ERROR: cannot read {filename}: {e}")
        continue

    # Clean numeric & drop NaN cols if any
    df_prices = ensure_df_numeric(df_prices)
    print("Loaded prices shape:", df_prices.shape)
    # If fewer than 2 columns -> skip
    if df_prices.shape[1] < 2:
        print(f"Not enough series in {sector_name} after cleaning. Skipping.")
        continue

    # compute returns
    df_returns = compute_log_returns(df_prices).dropna(how='any')
    print("Returns shape (after dropna):", df_returns.shape)
    if df_returns.shape[0] < 50:
        print("Too few observations for reliable VAR. Consider expanding period. Skipping.")
        continue

    # train / holdout split: leave HOLDOUT_DAYS last obs for evaluation
    if df_returns.shape[0] > HOLDOUT_DAYS:
        train = df_returns.iloc[:-HOLDOUT_DAYS]
        test = df_returns.iloc[-HOLDOUT_DAYS:]
    else:
        train = df_returns.copy()
        test = pd.DataFrame()  # empty

    # Lag order selection
    model = VAR(train)
    try:
        sel = model.select_order(MAX_LAGS)
    except Exception as e:
        print("select_order failed:", e)
        sel = None

    # read the lag orders selected by each information criterion
    if sel is not None:
        try:
            best_aic = sel.selected_orders.get('aic', None)
            best_bic = sel.selected_orders.get('bic', None)
            best_hqic = sel.selected_orders.get('hqic', None)
        except Exception:
            best_aic = sel.aic
            best_bic = sel.bic
            best_hqic = sel.hqic
    else:
        best_aic = best_bic = best_hqic = None

    # choose lag to fit: prefer AIC, fall back to BIC, and ensure >= 1
    if best_aic is not None and not pd.isna(best_aic) and best_aic >= 1:
        chosen_lag = int(best_aic)
    elif best_bic is not None and not pd.isna(best_bic) and best_bic >= 1:
        chosen_lag = int(best_bic)
    else:
        chosen_lag = 1
    print(f"Selected lags -> AIC: {best_aic}, BIC: {best_bic}, chosen: {chosen_lag}")

    # Fit VAR
    try:
        var_res = model.fit(chosen_lag)
    except Exception as e:
        print("VAR fit failed:", e)
        continue

    # Save summary text, falling back if as_text() is unavailable
    txt_out = os.path.join(OUT_DIR, f"{sector_name}_VAR_summary.txt")
    var_summary = var_res.summary()
    try:
        # older approach (if available)
        text = var_summary.as_text()
    except Exception:
        try:
            # if the object exposes tables (usually tables[0], tables[1])
            text = "\n\n".join(t.as_text() for t in var_summary.tables)
        except Exception:
            # last reliable fallback
            text = str(var_summary)
    with open(txt_out, "w", encoding="utf-8") as f:
        f.write(text)

    # Diagnostics
    diagnostics = summarize_var_results(var_res)
    diagnostics['sector'] = sector_name
    diagnostics['n_obs_train'] = train.shape[0]
    diagnostics['n_vars'] = train.shape[1]
    diagnostics['chosen_lag'] = chosen_lag

    # Save coefficients table
    coefs = var_res.params
    coefs.to_csv(os.path.join(OUT_DIR, f"{sector_name}_VAR_coeffs.csv"))

    # IRF
    try:
        irf = var_res.irf(20)
        fig_irf = irf.plot(orth=False)
        irf_png = os.path.join(OUT_DIR, f"{sector_name}_IRF.png")
        # statsmodels returns a matplotlib Figure if plot called; handle variant
        try:
            savefig(fig_irf, irf_png)
        except Exception:
            plt.savefig(irf_png, bbox_inches='tight', dpi=150)
        plt.close()
        print("IRF saved:", irf_png)
    except Exception as e:
        print("IRF failed:", e)

    # FEVD
    try:
        fevd = var_res.fevd(20)
        # plot fevd
        fig = fevd.plot()
        fevd_png = os.path.join(OUT_DIR, f"{sector_name}_FEVD.png")
        try:
            savefig(fig, fevd_png)
        except Exception:
            plt.savefig(fevd_png, bbox_inches='tight', dpi=150)
        plt.close()
        print("FEVD saved:", fevd_png)
    except Exception as e:
        print("FEVD failed:", e)

    # Forecasts for predefined horizons
    for h in FORECAST_HORIZONS:
        try:
            fc_ret_df, fc_price_df = forecast_and_reconstruct_prices(
                var_res, df_prices.loc[train.index.union(test.index)], train, h
            )
            fc_ret_df.to_csv(os.path.join(OUT_DIR, f"{sector_name}_forecast_returns_h{h}.csv"))
            fc_price_df.to_csv(os.path.join(OUT_DIR, f"{sector_name}_forecast_prices_h{h}.csv"))
            print(f"Forecasts (h={h}) saved for {sector_name}.")
        except Exception as e:
            print(f"Forecast failed for h={h}:", e)

    # Evaluate one-step-ahead rolling forecast on holdout if available
    eval_metrics = {}
    if not test.empty:
        try:
            # one-step rolling: refit on expanding window
            preds = []
            true = []
            train_copy = train.copy()
            for t in range(len(test)):
                m = VAR(train_copy)
                res_t = m.fit(chosen_lag)
                # forecast 1-step from the last k_ar observations
                yhat = res_t.forecast(y=train_copy.values[-res_t.k_ar:], steps=1)[0]
                preds.append(yhat)
                true.append(test.values[t])
                # append the true observation to train_copy for the next step
                new_row = pd.DataFrame([test.values[t]], index=[test.index[t]], columns=train_copy.columns)
                train_copy = pd.concat([train_copy, new_row])
            preds = np.vstack(preds)
            true = np.vstack(true)
            # compute RMSE and MAE per series
            rmses = np.sqrt(np.mean((preds - true)**2, axis=0))
            maes = np.mean(np.abs(preds - true), axis=0)
            eval_df = pd.DataFrame({"ticker": train.columns, "RMSE": rmses, "MAE": maes})
            eval_df.to_csv(os.path.join(OUT_DIR, f"{sector_name}_rolling_eval.csv"), index=False)
            eval_metrics['rolling_RMSE_mean'] = eval_df["RMSE"].mean()
            eval_metrics['rolling_MAE_mean'] = eval_df["MAE"].mean()
            print("Rolling one-step evaluation saved.")
        except Exception as e:
            print("Rolling forecast evaluation failed:", e)

    # Diagnostics tests saved
    diag_df = pd.DataFrame.from_dict({sector_name: diagnostics}, orient='index')
    diag_df.to_csv(os.path.join(OUT_DIR, f"{sector_name}_diagnostics.csv"))

    # Save residual diagnostic plots (histogram and ACF of residuals)
    try:
        from statsmodels.graphics.tsaplots import plot_acf
        resid = var_res.resid
        # one figure per equation: histogram on top, residual ACF below
        for col in resid.columns:
            fig, ax = plt.subplots(2, 1, figsize=(8, 6))
            sns.histplot(resid[col].dropna(), kde=True, ax=ax[0])
            ax[0].set_title(f"{sector_name} residuals histogram - {col}")
            plot_acf(resid[col].dropna(), ax=ax[1], lags=20)
            savefig(fig, os.path.join(OUT_DIR, f"{sector_name}_resid_{col}_hist_acf.png"))
        print("Residual diagnostic plots saved.")
    except Exception as e:
        print("Residual diagnostics failed:", e)

    # Append summary row
    row = {
        "sector": sector_name,
        "n_obs": df_returns.shape[0],
        "n_vars": df_returns.shape[1],
        "chosen_lag": chosen_lag,
        "stable": diagnostics.get('stable', np.nan),
        "max_root_modulus": diagnostics.get('max_root_modulus', np.nan),
        "serial_pvalue": diagnostics.get('serial_pvalue', np.nan),
        "normality_pvalue": diagnostics.get('normality_pvalue', np.nan),
        "arch_pvalue": diagnostics.get('arch_pvalue', np.nan),
        "dw_mean": diagnostics.get('dw_mean', np.nan),
        "rolling_RMSE_mean": eval_metrics.get('rolling_RMSE_mean', np.nan)
    }
    summary_rows.append(row)

# Save overall summary
summary_df = pd.DataFrame(summary_rows)
summary_df.to_csv(os.path.join(OUT_DIR, "VAR_overview_summary.csv"), index=False)
print("\nALL DONE. Results saved in folder:", OUT_DIR)
================================================================================
Processing sector: rus_fin
================================================================================
Loaded prices shape: (626, 3)
Returns shape (after dropna): (428, 3)
Selected lags -> AIC: 7, BIC: 0, chosen: 7
IRF saved: VAR_results\rus_fin_IRF.png
FEVD saved: VAR_results\rus_fin_FEVD.png
Forecasts (h=10) saved for rus_fin.
Forecasts (h=20) saved for rus_fin.
Forecasts (h=100) saved for rus_fin.
Rolling one-step evaluation saved.
Residual diagnostic plots saved.

================================================================================
Processing sector: rus_oil
================================================================================
Loaded prices shape: (1616, 3)
Returns shape (after dropna): (1615, 3)
Selected lags -> AIC: 2, BIC: 0, chosen: 2
IRF saved: VAR_results\rus_oil_IRF.png
FEVD saved: VAR_results\rus_oil_FEVD.png
Forecasts (h=10) saved for rus_oil.
Forecasts (h=20) saved for rus_oil.
Forecasts (h=100) saved for rus_oil.
Rolling one-step evaluation saved.
Residual diagnostic plots saved.

================================================================================
Processing sector: rus_met
================================================================================
Loaded prices shape: (1616, 3)
Returns shape (after dropna): (1615, 3)
Selected lags -> AIC: 1, BIC: 0, chosen: 1
IRF saved: VAR_results\rus_met_IRF.png
FEVD saved: VAR_results\rus_met_FEVD.png
Forecasts (h=10) saved for rus_met.
Forecasts (h=20) saved for rus_met.
Forecasts (h=100) saved for rus_met.
Rolling one-step evaluation saved.
Residual diagnostic plots saved.

================================================================================
Processing sector: us_tech
================================================================================
Loaded prices shape: (2514, 3)
Returns shape (after dropna): (2513, 3)
Selected lags -> AIC: 9, BIC: 1, chosen: 9
IRF saved: VAR_results\us_tech_IRF.png
FEVD saved: VAR_results\us_tech_FEVD.png
Forecasts (h=10) saved for us_tech.
Forecasts (h=20) saved for us_tech.
Forecasts (h=100) saved for us_tech.
Rolling one-step evaluation saved.
Residual diagnostic plots saved.

================================================================================
Processing sector: us_retail
================================================================================
Loaded prices shape: (2514, 3)
Returns shape (after dropna): (2513, 3)
Selected lags -> AIC: 1, BIC: 0, chosen: 1
IRF saved: VAR_results\us_retail_IRF.png
FEVD saved: VAR_results\us_retail_FEVD.png
Forecasts (h=10) saved for us_retail.
Forecasts (h=20) saved for us_retail.
Forecasts (h=100) saved for us_retail.
Rolling one-step evaluation saved.
Residual diagnostic plots saved.

================================================================================
Processing sector: us_energy
================================================================================
Loaded prices shape: (2514, 3)
Returns shape (after dropna): (2513, 3)
Selected lags -> AIC: 7, BIC: 0, chosen: 7
IRF saved: VAR_results\us_energy_IRF.png
FEVD saved: VAR_results\us_energy_FEVD.png
Forecasts (h=10) saved for us_energy.
Forecasts (h=20) saved for us_energy.
Forecasts (h=100) saved for us_energy.
Rolling one-step evaluation saved.
Residual diagnostic plots saved.

ALL DONE. Results saved in folder: VAR_results
Let's visualize the results.
from IPython.display import Image, display

RESULTS_DIR = "VAR_results"

def show_pngs(directory):
    """Display every PNG file found in the directory."""
    print("\n===== PNG visualizations =====\n")
    for file in sorted(os.listdir(directory)):
        if file.lower().endswith(".png"):
            path = os.path.join(directory, file)
            print(f"\n📌 Showing: {file}")
            display(Image(filename=path))

def show_csvs(directory, n_rows=10):
    """Display the first n_rows rows of every CSV file in the directory."""
    print("\n===== CSV tables =====\n")
    for file in sorted(os.listdir(directory)):
        if file.lower().endswith(".csv"):
            path = os.path.join(directory, file)
            print(f"\n📄 CSV: {file}")
            try:
                df = pd.read_csv(path)
                display(df.head(n_rows))
            except Exception as e:
                print("⚠ Error reading CSV:", e)

def show_txts(directory):
    """Print the beginning of every TXT file in the directory."""
    print("\n===== TXT summary =====\n")
    for file in sorted(os.listdir(directory)):
        if file.lower().endswith(".txt"):
            path = os.path.join(directory, file)
            print(f"\n📜 TXT: {file}")
            try:
                with open(path, "r", encoding="utf-8") as f:
                    print(f.read()[:2000])  # first 2000 characters
            except Exception as e:
                print("⚠ Error reading TXT:", e)

print("Viewing PNG/CSV/TXT files from the VAR_results directory")
show_csvs(RESULTS_DIR)
show_txts(RESULTS_DIR)
show_pngs(RESULTS_DIR)
Viewing PNG/CSV/TXT files from the VAR_results directory

===== CSV tables =====

📄 CSV: VAR_overview_summary.csv
| | sector | n_obs | n_vars | chosen_lag | stable | max_root_modulus | serial_pvalue | normality_pvalue | arch_pvalue | dw_mean | rolling_RMSE_mean |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | rus_fin | 428 | 3 | 7 | False | 4.975734 | NaN | NaN | NaN | 1.977567 | 0.090406 |
| 1 | rus_oil | 1615 | 3 | 2 | False | 24.717950 | NaN | NaN | NaN | 1.995539 | 0.061666 |
| 2 | rus_met | 1615 | 3 | 1 | False | 21.479943 | NaN | NaN | NaN | 1.999110 | 0.038605 |
| 3 | us_tech | 2513 | 3 | 9 | False | 1.730068 | NaN | NaN | NaN | 1.996385 | 0.015975 |
| 4 | us_retail | 2513 | 3 | 1 | False | 33.784002 | NaN | NaN | NaN | 1.999501 | 0.013985 |
| 5 | us_energy | 2513 | 3 | 7 | False | 3.077592 | NaN | NaN | NaN | 1.994895 | 0.013489 |
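The `stable` flag is False for every sector even though the lag coefficients are small. One possible explanation, assuming `summarize_var_results` compares the largest modulus of `var_res.roots` with 1: statsmodels documents `roots` as the roots of the characteristic polynomial, for which stability requires every modulus to be greater than 1 (equivalently, every eigenvalue of the companion matrix lies inside the unit circle). The sketch below checks this with NumPy only, using the L1 coefficients that `rus_met_VAR_coeffs.csv` reports further down; the variable names are illustrative and not part of the pipeline.

```python
import numpy as np

# L1 coefficients of the rus_met VAR(1), copied from rus_met_VAR_coeffs.csv.
# In the CSV, rows are lagged variables and columns are equations,
# so the lag matrix A (equation x lagged variable) is the transpose.
params_l1 = np.array([
    [-0.094152, 0.009384, -0.003245],   # L1.NLMK.ME
    [ 0.019060, 0.039199,  0.028179],   # L1.GMKN.ME
    [ 0.087085, 0.016557, -0.053390],   # L1.CHMF.ME
])
A = params_l1.T

# For a VAR(1) the companion matrix is A itself.
eig_moduli = np.abs(np.linalg.eigvals(A))
root_moduli = 1.0 / eig_moduli  # moduli of the characteristic roots

# Stable <=> all eigenvalues inside the unit circle <=> all roots outside it.
is_stable = bool(np.all(eig_moduli < 1.0))
print("largest eigenvalue modulus:", eig_moduli.max())
print("largest root modulus:", root_moduli.max())
print("stable:", is_stable)
```

With these coefficients the largest root modulus comes out close to the 21.48 shown for `rus_met`, which suggests that `max_root_modulus` is a characteristic root rather than an eigenvalue; under that reading the model is stable. Calling `var_res.is_stable()` on the fitted results object should give the same verdict.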
📄 CSV: rus_fin_VAR_coeffs.csv
| | Unnamed: 0 | SBER.ME | VTBR.ME | TCSG.ME |
|---|---|---|---|---|
| 0 | const | -0.000812 | -0.001614 | 0.001677 |
| 1 | L1.SBER.ME | -0.067880 | 0.000800 | 0.026687 |
| 2 | L1.VTBR.ME | 0.050555 | 0.149044 | 0.107342 |
| 3 | L1.TCSG.ME | 0.007783 | 0.026148 | -0.050594 |
| 4 | L2.SBER.ME | -0.008981 | 0.100173 | 0.000664 |
| 5 | L2.VTBR.ME | -0.066138 | -0.028755 | -0.047856 |
| 6 | L2.TCSG.ME | 0.018146 | -0.020642 | -0.054481 |
| 7 | L3.SBER.ME | -0.172843 | -0.043765 | -0.356390 |
| 8 | L3.VTBR.ME | -0.015094 | -0.074849 | 0.217579 |
| 9 | L3.TCSG.ME | 0.177597 | 0.196131 | 0.288098 |
📄 CSV: rus_fin_diagnostics.csv
| | Unnamed: 0 | stable | max_root_modulus | serial_pvalue | normality_pvalue | arch_pvalue | dw_mean | sector | n_obs_train | n_vars | chosen_lag |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | rus_fin | False | 4.975734 | NaN | NaN | NaN | 1.977567 | rus_fin | 328 | 3 | 7 |
📄 CSV: rus_fin_forecast_prices_h10.csv
| | Unnamed: 0 | SBER.ME | VTBR.ME | TCSG.ME |
|---|---|---|---|---|
| 0 | 2022-05-25 | 122.936519 | 0.018983 | 2176.249157 |
| 1 | 2022-05-26 | 123.396624 | 0.018904 | 2154.390228 |
| 2 | 2022-05-27 | 124.723073 | 0.019065 | 2187.639044 |
| 3 | 2022-05-30 | 125.121019 | 0.019122 | 2217.708226 |
| 4 | 2022-05-31 | 122.456150 | 0.018862 | 2176.851007 |
| 5 | 2022-06-01 | 123.934430 | 0.018990 | 2190.451168 |
| 6 | 2022-06-02 | 124.534915 | 0.019045 | 2218.257177 |
| 7 | 2022-06-03 | 123.605252 | 0.018940 | 2216.201904 |
| 8 | 2022-06-06 | 123.132373 | 0.018857 | 2205.076383 |
| 9 | 2022-06-07 | 123.611152 | 0.018882 | 2209.227650 |
📄 CSV: rus_fin_forecast_prices_h100.csv
| | Unnamed: 0 | SBER.ME | VTBR.ME | TCSG.ME |
|---|---|---|---|---|
| 0 | 2022-05-25 | 122.936519 | 0.018983 | 2176.249157 |
| 1 | 2022-05-26 | 123.396624 | 0.018904 | 2154.390228 |
| 2 | 2022-05-27 | 124.723073 | 0.019065 | 2187.639044 |
| 3 | 2022-05-30 | 125.121019 | 0.019122 | 2217.708226 |
| 4 | 2022-05-31 | 122.456150 | 0.018862 | 2176.851007 |
| 5 | 2022-06-01 | 123.934430 | 0.018990 | 2190.451168 |
| 6 | 2022-06-02 | 124.534915 | 0.019045 | 2218.257177 |
| 7 | 2022-06-03 | 123.605252 | 0.018940 | 2216.201904 |
| 8 | 2022-06-06 | 123.132373 | 0.018857 | 2205.076383 |
| 9 | 2022-06-07 | 123.611152 | 0.018882 | 2209.227650 |
📄 CSV: rus_fin_forecast_prices_h20.csv
| | Unnamed: 0 | SBER.ME | VTBR.ME | TCSG.ME |
|---|---|---|---|---|
| 0 | 2022-05-25 | 122.936519 | 0.018983 | 2176.249157 |
| 1 | 2022-05-26 | 123.396624 | 0.018904 | 2154.390228 |
| 2 | 2022-05-27 | 124.723073 | 0.019065 | 2187.639044 |
| 3 | 2022-05-30 | 125.121019 | 0.019122 | 2217.708226 |
| 4 | 2022-05-31 | 122.456150 | 0.018862 | 2176.851007 |
| 5 | 2022-06-01 | 123.934430 | 0.018990 | 2190.451168 |
| 6 | 2022-06-02 | 124.534915 | 0.019045 | 2218.257177 |
| 7 | 2022-06-03 | 123.605252 | 0.018940 | 2216.201904 |
| 8 | 2022-06-06 | 123.132373 | 0.018857 | 2205.076383 |
| 9 | 2022-06-07 | 123.611152 | 0.018882 | 2209.227650 |
📄 CSV: rus_fin_forecast_returns_h10.csv
| | Unnamed: 0 | SBER.ME | VTBR.ME | TCSG.ME |
|---|---|---|---|---|
| 0 | 2022-05-25 | -0.021434 | -0.009806 | -0.027309 |
| 1 | 2022-05-26 | 0.003736 | -0.004169 | -0.010095 |
| 2 | 2022-05-27 | 0.010692 | 0.008474 | 0.015315 |
| 3 | 2022-05-30 | 0.003186 | 0.003014 | 0.013651 |
| 4 | 2022-05-31 | -0.021528 | -0.013706 | -0.018595 |
| 5 | 2022-06-01 | 0.012000 | 0.006780 | 0.006228 |
| 6 | 2022-06-02 | 0.004833 | 0.002874 | 0.012614 |
| 7 | 2022-06-03 | -0.007493 | -0.005514 | -0.000927 |
| 8 | 2022-06-06 | -0.003833 | -0.004394 | -0.005033 |
| 9 | 2022-06-07 | 0.003881 | 0.001316 | 0.001881 |
📄 CSV: rus_fin_forecast_returns_h100.csv
| | Unnamed: 0 | SBER.ME | VTBR.ME | TCSG.ME |
|---|---|---|---|---|
| 0 | 2022-05-25 | -0.021434 | -0.009806 | -0.027309 |
| 1 | 2022-05-26 | 0.003736 | -0.004169 | -0.010095 |
| 2 | 2022-05-27 | 0.010692 | 0.008474 | 0.015315 |
| 3 | 2022-05-30 | 0.003186 | 0.003014 | 0.013651 |
| 4 | 2022-05-31 | -0.021528 | -0.013706 | -0.018595 |
| 5 | 2022-06-01 | 0.012000 | 0.006780 | 0.006228 |
| 6 | 2022-06-02 | 0.004833 | 0.002874 | 0.012614 |
| 7 | 2022-06-03 | -0.007493 | -0.005514 | -0.000927 |
| 8 | 2022-06-06 | -0.003833 | -0.004394 | -0.005033 |
| 9 | 2022-06-07 | 0.003881 | 0.001316 | 0.001881 |
📄 CSV: rus_fin_forecast_returns_h20.csv
| | Unnamed: 0 | SBER.ME | VTBR.ME | TCSG.ME |
|---|---|---|---|---|
| 0 | 2022-05-25 | -0.021434 | -0.009806 | -0.027309 |
| 1 | 2022-05-26 | 0.003736 | -0.004169 | -0.010095 |
| 2 | 2022-05-27 | 0.010692 | 0.008474 | 0.015315 |
| 3 | 2022-05-30 | 0.003186 | 0.003014 | 0.013651 |
| 4 | 2022-05-31 | -0.021528 | -0.013706 | -0.018595 |
| 5 | 2022-06-01 | 0.012000 | 0.006780 | 0.006228 |
| 6 | 2022-06-02 | 0.004833 | 0.002874 | 0.012614 |
| 7 | 2022-06-03 | -0.007493 | -0.005514 | -0.000927 |
| 8 | 2022-06-06 | -0.003833 | -0.004394 | -0.005033 |
| 9 | 2022-06-07 | 0.003881 | 0.001316 | 0.001881 |
📄 CSV: rus_fin_rolling_eval.csv
| | ticker | RMSE | MAE |
|---|---|---|---|
| 0 | SBER.ME | 0.088903 | 0.044799 |
| 1 | VTBR.ME | 0.076988 | 0.040065 |
| 2 | TCSG.ME | 0.105329 | 0.062755 |
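The per-ticker errors above feed the `rolling_RMSE_mean` column of the overview table through a plain average across tickers. A minimal sketch of both steps follows: per-series RMSE/MAE from stacked one-step forecasts, then the cross-ticker average. The `toy_preds` and `toy_true` arrays are made-up numbers for illustration, not pipeline output, and `per_series_errors` is a hypothetical helper name.

```python
import numpy as np

def per_series_errors(preds, true):
    """Return (RMSE, MAE) per column for arrays of shape (n_steps, n_series)."""
    err = np.asarray(preds, dtype=float) - np.asarray(true, dtype=float)
    rmse = np.sqrt(np.mean(err ** 2, axis=0))
    mae = np.mean(np.abs(err), axis=0)
    return rmse, mae

# Toy example (hypothetical numbers): 3 forecast steps, 2 series.
toy_preds = np.array([[0.00, 0.01], [0.00, 0.01], [0.00, 0.01]])
toy_true = np.array([[0.03, 0.01], [-0.03, 0.03], [0.00, -0.01]])
toy_rmse, toy_mae = per_series_errors(toy_preds, toy_true)

# Averaging the rus_fin RMSE values from the table above.
rus_fin_rmse = np.array([0.088903, 0.076988, 0.105329])
rus_fin_rmse_mean = rus_fin_rmse.mean()
print(toy_rmse, toy_mae, round(rus_fin_rmse_mean, 6))
```

Because RMSE squares the errors, it is never smaller than MAE. In the `rus_fin` table RMSE is roughly twice MAE, which suggests that a few large misses in the holdout window dominate the score.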
📄 CSV: rus_met_VAR_coeffs.csv
| | Unnamed: 0 | NLMK.ME | GMKN.ME | CHMF.ME |
|---|---|---|---|---|
| 0 | const | 0.001201 | 0.000496 | 0.001027 |
| 1 | L1.NLMK.ME | -0.094152 | 0.009384 | -0.003245 |
| 2 | L1.GMKN.ME | 0.019060 | 0.039199 | 0.028179 |
| 3 | L1.CHMF.ME | 0.087085 | 0.016557 | -0.053390 |
📄 CSV: rus_met_diagnostics.csv
| | Unnamed: 0 | stable | max_root_modulus | serial_pvalue | normality_pvalue | arch_pvalue | dw_mean | sector | n_obs_train | n_vars | chosen_lag |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | rus_met | False | 21.479943 | NaN | NaN | NaN | 1.99911 | rus_met | 1515 | 3 | 1 |
📄 CSV: rus_met_forecast_prices_h10.csv
| | Unnamed: 0 | NLMK.ME | GMKN.ME | CHMF.ME |
|---|---|---|---|---|
| 0 | 2022-05-25 | 161.593204 | 21293.884481 | 1153.893576 |
| 1 | 2022-05-26 | 161.829379 | 21305.191851 | 1154.894727 |
| 2 | 2022-05-27 | 162.015516 | 21316.811296 | 1156.040182 |
| 3 | 2022-05-30 | 162.208414 | 21328.430669 | 1157.180743 |
| 4 | 2022-05-31 | 162.400849 | 21340.062364 | 1158.322589 |
| 5 | 2022-06-01 | 162.593581 | 21351.699833 | 1159.465580 |
| 6 | 2022-06-02 | 162.786535 | 21363.343715 | 1160.609694 |
| 7 | 2022-06-03 | 162.979719 | 21374.993940 | 1161.754938 |
| 8 | 2022-06-06 | 163.173132 | 21386.650520 | 1162.901312 |
| 9 | 2022-06-07 | 163.366774 | 21398.313456 | 1164.048818 |
📄 CSV: rus_met_forecast_prices_h100.csv
| | Unnamed: 0 | NLMK.ME | GMKN.ME | CHMF.ME |
|---|---|---|---|---|
| 0 | 2022-05-25 | 161.593204 | 21293.884481 | 1153.893576 |
| 1 | 2022-05-26 | 161.829379 | 21305.191851 | 1154.894727 |
| 2 | 2022-05-27 | 162.015516 | 21316.811296 | 1156.040182 |
| 3 | 2022-05-30 | 162.208414 | 21328.430669 | 1157.180743 |
| 4 | 2022-05-31 | 162.400849 | 21340.062364 | 1158.322589 |
| 5 | 2022-06-01 | 162.593581 | 21351.699833 | 1159.465580 |
| 6 | 2022-06-02 | 162.786535 | 21363.343715 | 1160.609694 |
| 7 | 2022-06-03 | 162.979719 | 21374.993940 | 1161.754938 |
| 8 | 2022-06-06 | 163.173132 | 21386.650520 | 1162.901312 |
| 9 | 2022-06-07 | 163.366774 | 21398.313456 | 1164.048818 |
📄 CSV: rus_met_forecast_prices_h20.csv
| | Unnamed: 0 | NLMK.ME | GMKN.ME | CHMF.ME |
|---|---|---|---|---|
| 0 | 2022-05-25 | 161.593204 | 21293.884481 | 1153.893576 |
| 1 | 2022-05-26 | 161.829379 | 21305.191851 | 1154.894727 |
| 2 | 2022-05-27 | 162.015516 | 21316.811296 | 1156.040182 |
| 3 | 2022-05-30 | 162.208414 | 21328.430669 | 1157.180743 |
| 4 | 2022-05-31 | 162.400849 | 21340.062364 | 1158.322589 |
| 5 | 2022-06-01 | 162.593581 | 21351.699833 | 1159.465580 |
| 6 | 2022-06-02 | 162.786535 | 21363.343715 | 1160.609694 |
| 7 | 2022-06-03 | 162.979719 | 21374.993940 | 1161.754938 |
| 8 | 2022-06-06 | 163.173132 | 21386.650520 | 1162.901312 |
| 9 | 2022-06-07 | 163.366774 | 21398.313456 | 1164.048818 |
📄 CSV: rus_met_forecast_returns_h10.csv
| | Unnamed: 0 | NLMK.ME | GMKN.ME | CHMF.ME |
|---|---|---|---|---|
| 0 | 2022-05-25 | -0.000166 | -0.000287 | 0.002858 |
| 1 | 2022-05-26 | 0.001460 | 0.000531 | 0.000867 |
| 2 | 2022-05-27 | 0.001150 | 0.000545 | 0.000991 |
| 3 | 2022-05-30 | 0.001190 | 0.000545 | 0.000986 |
| 4 | 2022-05-31 | 0.001186 | 0.000545 | 0.000986 |
| 5 | 2022-06-01 | 0.001186 | 0.000545 | 0.000986 |
| 6 | 2022-06-02 | 0.001186 | 0.000545 | 0.000986 |
| 7 | 2022-06-03 | 0.001186 | 0.000545 | 0.000986 |
| 8 | 2022-06-06 | 0.001186 | 0.000545 | 0.000986 |
| 9 | 2022-06-07 | 0.001186 | 0.000545 | 0.000986 |
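The price tables are consistent with compounding the forecast log returns step by step, P_t = P_{t-1} · exp(r_t), assuming the returns are log returns as the name `compute_log_returns` implies. A small sketch with the standard library only, using the NLMK.ME numbers from the two tables; `prices_from_log_returns` is a hypothetical helper, not the notebook's `forecast_and_reconstruct_prices`.

```python
import math

def prices_from_log_returns(last_price, log_returns):
    """Compound a starting price through a sequence of log returns."""
    prices = []
    prev = last_price
    for r in log_returns:
        prev = prev * math.exp(r)  # P_t = P_{t-1} * exp(r_t)
        prices.append(prev)
    return prices

# NLMK.ME: first forecast price (2022-05-25) and the forecast returns
# for the next two sessions, as displayed in the tables.
start = 161.593204
returns = [0.001460, 0.001150]
path = prices_from_log_returns(start, returns)
print([round(p, 2) for p in path])
```

The two resulting prices agree with the 161.829379 and 162.015516 shown in `rus_met_forecast_prices_h10.csv` up to the rounding of the displayed returns.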
📄 CSV: rus_met_forecast_returns_h100.csv
| | Unnamed: 0 | NLMK.ME | GMKN.ME | CHMF.ME |
|---|---|---|---|---|
| 0 | 2022-05-25 | -0.000166 | -0.000287 | 0.002858 |
| 1 | 2022-05-26 | 0.001460 | 0.000531 | 0.000867 |
| 2 | 2022-05-27 | 0.001150 | 0.000545 | 0.000991 |
| 3 | 2022-05-30 | 0.001190 | 0.000545 | 0.000986 |
| 4 | 2022-05-31 | 0.001186 | 0.000545 | 0.000986 |
| 5 | 2022-06-01 | 0.001186 | 0.000545 | 0.000986 |
| 6 | 2022-06-02 | 0.001186 | 0.000545 | 0.000986 |
| 7 | 2022-06-03 | 0.001186 | 0.000545 | 0.000986 |
| 8 | 2022-06-06 | 0.001186 | 0.000545 | 0.000986 |
| 9 | 2022-06-07 | 0.001186 | 0.000545 | 0.000986 |
📄 CSV: rus_met_forecast_returns_h20.csv
| | Unnamed: 0 | NLMK.ME | GMKN.ME | CHMF.ME |
|---|---|---|---|---|
| 0 | 2022-05-25 | -0.000166 | -0.000287 | 0.002858 |
| 1 | 2022-05-26 | 0.001460 | 0.000531 | 0.000867 |
| 2 | 2022-05-27 | 0.001150 | 0.000545 | 0.000991 |
| 3 | 2022-05-30 | 0.001190 | 0.000545 | 0.000986 |
| 4 | 2022-05-31 | 0.001186 | 0.000545 | 0.000986 |
| 5 | 2022-06-01 | 0.001186 | 0.000545 | 0.000986 |
| 6 | 2022-06-02 | 0.001186 | 0.000545 | 0.000986 |
| 7 | 2022-06-03 | 0.001186 | 0.000545 | 0.000986 |
| 8 | 2022-06-06 | 0.001186 | 0.000545 | 0.000986 |
| 9 | 2022-06-07 | 0.001186 | 0.000545 | 0.000986 |
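The `rus_met` return forecasts flatten out after a few steps (0.001186, 0.000545, 0.000986 from the fifth row on). For a VAR(1) with small coefficients this is expected: iterating y_{t+1} = c + A·y_t drives the forecast to the unconditional mean μ = (I − A)^{-1}·c. The sketch below reproduces that limit with NumPy from the coefficients in `rus_met_VAR_coeffs.csv`; the variable names are illustrative.

```python
import numpy as np

# Intercept and L1 coefficients of the rus_met VAR(1) (rus_met_VAR_coeffs.csv).
c = np.array([0.001201, 0.000496, 0.001027])
A = np.array([
    [-0.094152, 0.009384, -0.003245],   # L1.NLMK.ME
    [ 0.019060, 0.039199,  0.028179],   # L1.GMKN.ME
    [ 0.087085, 0.016557, -0.053390],   # L1.CHMF.ME
]).T  # CSV rows are lagged variables, so transpose to equation x variable

# Unconditional mean: mu = (I - A)^{-1} c
mu = np.linalg.solve(np.eye(3) - A, c)

# Iterate the forecast recursion starting from the first forecast row.
y = np.array([-0.000166, -0.000287, 0.002858])
for _ in range(9):
    y = c + A @ y

print(np.round(mu, 6))
print(np.round(y, 6))
```

Both printed vectors should agree with the constant rows of the table. In practice this means the 20- and 100-session forecasts add little beyond a constant daily drift applied to the last price.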
📄 CSV: rus_met_rolling_eval.csv
| | ticker | RMSE | MAE |
|---|---|---|---|
| 0 | NLMK.ME | 0.037225 | 0.020606 |
| 1 | GMKN.ME | 0.032015 | 0.017939 |
| 2 | CHMF.ME | 0.046574 | 0.025976 |
📄 CSV: rus_oil_VAR_coeffs.csv
| | Unnamed: 0 | GAZP.ME | LKOH.ME | ROSN.ME |
|---|---|---|---|---|
| 0 | const | 0.000783 | 0.000940 | 0.000618 |
| 1 | L1.GAZP.ME | 0.077234 | -0.011864 | 0.000381 |
| 2 | L1.LKOH.ME | -0.026727 | -0.040125 | -0.000113 |
| 3 | L1.ROSN.ME | 0.001376 | 0.082202 | 0.119774 |
| 4 | L2.GAZP.ME | 0.027767 | 0.046319 | -0.003363 |
| 5 | L2.LKOH.ME | -0.052866 | -0.113622 | -0.021621 |
| 6 | L2.ROSN.ME | 0.006264 | 0.014220 | -0.000542 |
📄 CSV: rus_oil_diagnostics.csv
| | Unnamed: 0 | stable | max_root_modulus | serial_pvalue | normality_pvalue | arch_pvalue | dw_mean | sector | n_obs_train | n_vars | chosen_lag |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | rus_oil | False | 24.71795 | NaN | NaN | NaN | 1.995539 | rus_oil | 1515 | 3 | 2 |
📄 CSV: rus_oil_forecast_prices_h10.csv
| | Unnamed: 0 | GAZP.ME | LKOH.ME | ROSN.ME |
|---|---|---|---|---|
| 0 | 2022-05-25 | 266.451624 | 4405.437944 | 398.788036 |
| 1 | 2022-05-26 | 266.793955 | 4414.768234 | 399.106213 |
| 2 | 2022-05-27 | 267.036984 | 4419.608328 | 399.412436 |
| 3 | 2022-05-30 | 267.238300 | 4423.049869 | 399.675929 |
| 4 | 2022-05-31 | 267.450351 | 4426.954081 | 399.943765 |
| 5 | 2022-06-01 | 267.665794 | 4430.965776 | 400.215263 |
| 6 | 2022-06-02 | 267.880335 | 4434.935890 | 400.486407 |
| 7 | 2022-06-03 | 268.094808 | 4438.902281 | 400.757450 |
| 8 | 2022-06-06 | 268.309559 | 4442.876534 | 401.028735 |
| 9 | 2022-06-07 | 268.524498 | 4446.854806 | 401.300227 |
📄 CSV: rus_oil_forecast_prices_h100.csv
| | Unnamed: 0 | GAZP.ME | LKOH.ME | ROSN.ME |
|---|---|---|---|---|
| 0 | 2022-05-25 | 266.451624 | 4405.437944 | 398.788036 |
| 1 | 2022-05-26 | 266.793955 | 4414.768234 | 399.106213 |
| 2 | 2022-05-27 | 267.036984 | 4419.608328 | 399.412436 |
| 3 | 2022-05-30 | 267.238300 | 4423.049869 | 399.675929 |
| 4 | 2022-05-31 | 267.450351 | 4426.954081 | 399.943765 |
| 5 | 2022-06-01 | 267.665794 | 4430.965776 | 400.215263 |
| 6 | 2022-06-02 | 267.880335 | 4434.935890 | 400.486407 |
| 7 | 2022-06-03 | 268.094808 | 4438.902281 | 400.757450 |
| 8 | 2022-06-06 | 268.309559 | 4442.876534 | 401.028735 |
| 9 | 2022-06-07 | 268.524498 | 4446.854806 | 401.300227 |
📄 CSV: rus_oil_forecast_prices_h20.csv
| | Unnamed: 0 | GAZP.ME | LKOH.ME | ROSN.ME |
|---|---|---|---|---|
| 0 | 2022-05-25 | 266.451624 | 4405.437944 | 398.788036 |
| 1 | 2022-05-26 | 266.793955 | 4414.768234 | 399.106213 |
| 2 | 2022-05-27 | 267.036984 | 4419.608328 | 399.412436 |
| 3 | 2022-05-30 | 267.238300 | 4423.049869 | 399.675929 |
| 4 | 2022-05-31 | 267.450351 | 4426.954081 | 399.943765 |
| 5 | 2022-06-01 | 267.665794 | 4430.965776 | 400.215263 |
| 6 | 2022-06-02 | 267.880335 | 4434.935890 | 400.486407 |
| 7 | 2022-06-03 | 268.094808 | 4438.902281 | 400.757450 |
| 8 | 2022-06-06 | 268.309559 | 4442.876534 | 401.028735 |
| 9 | 2022-06-07 | 268.524498 | 4446.854806 | 401.300227 |
📄 CSV: rus_oil_forecast_returns_h10.csv
| | Unnamed: 0 | GAZP.ME | LKOH.ME | ROSN.ME |
|---|---|---|---|---|
| 0 | 2022-05-25 | -0.000857 | -0.002281 | -0.002034 |
| 1 | 2022-05-26 | 0.001284 | 0.002116 | 0.000798 |
| 2 | 2022-05-27 | 0.000911 | 0.001096 | 0.000767 |
| 3 | 2022-05-30 | 0.000754 | 0.000778 | 0.000659 |
| 4 | 2022-05-31 | 0.000793 | 0.000882 | 0.000670 |
| 5 | 2022-06-01 | 0.000805 | 0.000906 | 0.000679 |
| 6 | 2022-06-02 | 0.000801 | 0.000896 | 0.000677 |
| 7 | 2022-06-03 | 0.000800 | 0.000894 | 0.000677 |
| 8 | 2022-06-06 | 0.000801 | 0.000895 | 0.000677 |
| 9 | 2022-06-07 | 0.000801 | 0.000895 | 0.000677 |
📄 CSV: rus_oil_forecast_returns_h100.csv
| | Unnamed: 0 | GAZP.ME | LKOH.ME | ROSN.ME |
|---|---|---|---|---|
| 0 | 2022-05-25 | -0.000857 | -0.002281 | -0.002034 |
| 1 | 2022-05-26 | 0.001284 | 0.002116 | 0.000798 |
| 2 | 2022-05-27 | 0.000911 | 0.001096 | 0.000767 |
| 3 | 2022-05-30 | 0.000754 | 0.000778 | 0.000659 |
| 4 | 2022-05-31 | 0.000793 | 0.000882 | 0.000670 |
| 5 | 2022-06-01 | 0.000805 | 0.000906 | 0.000679 |
| 6 | 2022-06-02 | 0.000801 | 0.000896 | 0.000677 |
| 7 | 2022-06-03 | 0.000800 | 0.000894 | 0.000677 |
| 8 | 2022-06-06 | 0.000801 | 0.000895 | 0.000677 |
| 9 | 2022-06-07 | 0.000801 | 0.000895 | 0.000677 |
📄 CSV: rus_oil_forecast_returns_h20.csv
| | Unnamed: 0 | GAZP.ME | LKOH.ME | ROSN.ME |
|---|---|---|---|---|
| 0 | 2022-05-25 | -0.000857 | -0.002281 | -0.002034 |
| 1 | 2022-05-26 | 0.001284 | 0.002116 | 0.000798 |
| 2 | 2022-05-27 | 0.000911 | 0.001096 | 0.000767 |
| 3 | 2022-05-30 | 0.000754 | 0.000778 | 0.000659 |
| 4 | 2022-05-31 | 0.000793 | 0.000882 | 0.000670 |
| 5 | 2022-06-01 | 0.000805 | 0.000906 | 0.000679 |
| 6 | 2022-06-02 | 0.000801 | 0.000896 | 0.000677 |
| 7 | 2022-06-03 | 0.000800 | 0.000894 | 0.000677 |
| 8 | 2022-06-06 | 0.000801 | 0.000895 | 0.000677 |
| 9 | 2022-06-07 | 0.000801 | 0.000895 | 0.000677 |
📄 CSV: rus_oil_rolling_eval.csv
| | ticker | RMSE | MAE |
|---|---|---|---|
| 0 | GAZP.ME | 0.059383 | 0.030505 |
| 1 | LKOH.ME | 0.059952 | 0.028453 |
| 2 | ROSN.ME | 0.065664 | 0.028774 |
📄 CSV: us_energy_VAR_coeffs.csv
| | Unnamed: 0 | XOM | CVX | COP |
|---|---|---|---|---|
| 0 | const | 0.000329 | 0.000374 | 0.000383 |
| 1 | L1.XOM | 0.112250 | 0.097129 | 0.166683 |
| 2 | L1.CVX | -0.135727 | -0.165454 | -0.251606 |
| 3 | L1.COP | -0.012350 | 0.006618 | 0.011816 |
| 4 | L2.XOM | -0.095013 | -0.115642 | -0.027888 |
| 5 | L2.CVX | 0.072186 | 0.109733 | 0.037841 |
| 6 | L2.COP | 0.039296 | 0.040802 | 0.018680 |
| 7 | L3.XOM | -0.097785 | -0.016026 | -0.072616 |
| 8 | L3.CVX | -0.017596 | -0.061729 | -0.068044 |
| 9 | L3.COP | 0.069671 | 0.053246 | 0.105017 |
📄 CSV: us_energy_diagnostics.csv
| | Unnamed: 0 | stable | max_root_modulus | serial_pvalue | normality_pvalue | arch_pvalue | dw_mean | sector | n_obs_train | n_vars | chosen_lag |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | us_energy | False | 3.077592 | NaN | NaN | NaN | 1.994895 | us_energy | 2413 | 3 | 7 |
📄 CSV: us_energy_forecast_prices_h10.csv
| | Unnamed: 0 | XOM | CVX | COP |
|---|---|---|---|---|
| 0 | 2025-11-21 | 117.061793 | 150.224186 | 87.295047 |
| 1 | 2025-11-24 | 117.437936 | 150.395922 | 87.395788 |
| 2 | 2025-11-25 | 117.317412 | 150.139710 | 87.265534 |
| 3 | 2025-11-26 | 117.386376 | 150.384276 | 87.415440 |
| 4 | 2025-11-27 | 117.360757 | 150.412089 | 87.407909 |
| 5 | 2025-11-28 | 117.382439 | 150.373188 | 87.376548 |
| 6 | 2025-12-01 | 117.625827 | 150.906205 | 87.667384 |
| 7 | 2025-12-02 | 117.612495 | 150.851548 | 87.625205 |
| 8 | 2025-12-03 | 117.664964 | 150.969592 | 87.677646 |
| 9 | 2025-12-04 | 117.664591 | 150.937111 | 87.669042 |
📄 CSV: us_energy_forecast_prices_h100.csv
| | Unnamed: 0 | XOM | CVX | COP |
|---|---|---|---|---|
| 0 | 2025-11-21 | 117.061793 | 150.224186 | 87.295047 |
| 1 | 2025-11-24 | 117.437936 | 150.395922 | 87.395788 |
| 2 | 2025-11-25 | 117.317412 | 150.139710 | 87.265534 |
| 3 | 2025-11-26 | 117.386376 | 150.384276 | 87.415440 |
| 4 | 2025-11-27 | 117.360757 | 150.412089 | 87.407909 |
| 5 | 2025-11-28 | 117.382439 | 150.373188 | 87.376548 |
| 6 | 2025-12-01 | 117.625827 | 150.906205 | 87.667384 |
| 7 | 2025-12-02 | 117.612495 | 150.851548 | 87.625205 |
| 8 | 2025-12-03 | 117.664964 | 150.969592 | 87.677646 |
| 9 | 2025-12-04 | 117.664591 | 150.937111 | 87.669042 |
📄 CSV: us_energy_forecast_prices_h20.csv
| | Unnamed: 0 | XOM | CVX | COP |
|---|---|---|---|---|
| 0 | 2025-11-21 | 117.061793 | 150.224186 | 87.295047 |
| 1 | 2025-11-24 | 117.437936 | 150.395922 | 87.395788 |
| 2 | 2025-11-25 | 117.317412 | 150.139710 | 87.265534 |
| 3 | 2025-11-26 | 117.386376 | 150.384276 | 87.415440 |
| 4 | 2025-11-27 | 117.360757 | 150.412089 | 87.407909 |
| 5 | 2025-11-28 | 117.382439 | 150.373188 | 87.376548 |
| 6 | 2025-12-01 | 117.625827 | 150.906205 | 87.667384 |
| 7 | 2025-12-02 | 117.612495 | 150.851548 | 87.625205 |
| 8 | 2025-12-03 | 117.664964 | 150.969592 | 87.677646 |
| 9 | 2025-12-04 | 117.664591 | 150.937111 | 87.669042 |
📄 CSV: us_energy_forecast_returns_h10.csv
| | Unnamed: 0 | XOM | CVX | COP |
|---|---|---|---|---|
| 0 | 2025-11-21 | 0.000357 | -0.000571 | -0.002002 |
| 1 | 2025-11-24 | 0.003208 | 0.001143 | 0.001153 |
| 2 | 2025-11-25 | -0.001027 | -0.001705 | -0.001492 |
| 3 | 2025-11-26 | 0.000588 | 0.001628 | 0.001716 |
| 4 | 2025-11-27 | -0.000218 | 0.000185 | -0.000086 |
| 5 | 2025-11-28 | 0.000185 | -0.000259 | -0.000359 |
| 6 | 2025-12-01 | 0.002071 | 0.003538 | 0.003323 |
| 7 | 2025-12-02 | -0.000113 | -0.000362 | -0.000481 |
| 8 | 2025-12-03 | 0.000446 | 0.000782 | 0.000598 |
| 9 | 2025-12-04 | -0.000003 | -0.000215 | -0.000098 |
📄 CSV: us_energy_forecast_returns_h100.csv
| | Unnamed: 0 | XOM | CVX | COP |
|---|---|---|---|---|
| 0 | 2025-11-21 | 0.000357 | -0.000571 | -0.002002 |
| 1 | 2025-11-24 | 0.003208 | 0.001143 | 0.001153 |
| 2 | 2025-11-25 | -0.001027 | -0.001705 | -0.001492 |
| 3 | 2025-11-26 | 0.000588 | 0.001628 | 0.001716 |
| 4 | 2025-11-27 | -0.000218 | 0.000185 | -0.000086 |
| 5 | 2025-11-28 | 0.000185 | -0.000259 | -0.000359 |
| 6 | 2025-12-01 | 0.002071 | 0.003538 | 0.003323 |
| 7 | 2025-12-02 | -0.000113 | -0.000362 | -0.000481 |
| 8 | 2025-12-03 | 0.000446 | 0.000782 | 0.000598 |
| 9 | 2025-12-04 | -0.000003 | -0.000215 | -0.000098 |
📄 CSV: us_energy_forecast_returns_h20.csv
| | Unnamed: 0 | XOM | CVX | COP |
|---|---|---|---|---|
| 0 | 2025-11-21 | 0.000357 | -0.000571 | -0.002002 |
| 1 | 2025-11-24 | 0.003208 | 0.001143 | 0.001153 |
| 2 | 2025-11-25 | -0.001027 | -0.001705 | -0.001492 |
| 3 | 2025-11-26 | 0.000588 | 0.001628 | 0.001716 |
| 4 | 2025-11-27 | -0.000218 | 0.000185 | -0.000086 |
| 5 | 2025-11-28 | 0.000185 | -0.000259 | -0.000359 |
| 6 | 2025-12-01 | 0.002071 | 0.003538 | 0.003323 |
| 7 | 2025-12-02 | -0.000113 | -0.000362 | -0.000481 |
| 8 | 2025-12-03 | 0.000446 | 0.000782 | 0.000598 |
| 9 | 2025-12-04 | -0.000003 | -0.000215 | -0.000098 |
📄 CSV: us_energy_rolling_eval.csv
| | ticker | RMSE | MAE |
|---|---|---|---|
| 0 | XOM | 0.011902 | 0.009574 |
| 1 | CVX | 0.012480 | 0.009807 |
| 2 | COP | 0.016086 | 0.012614 |
📄 CSV: us_retail_VAR_coeffs.csv
| | Unnamed: 0 | WMT | COST | TGT |
|---|---|---|---|---|
| 0 | const | 0.000772 | 0.000845 | 0.000274 |
| 1 | L1.WMT | -0.040524 | 0.000503 | 0.014336 |
| 2 | L1.COST | 0.004596 | -0.032054 | -0.024830 |
| 3 | L1.TGT | -0.034155 | -0.002398 | -0.018063 |
📄 CSV: us_retail_diagnostics.csv
| | Unnamed: 0 | stable | max_root_modulus | serial_pvalue | normality_pvalue | arch_pvalue | dw_mean | sector | n_obs_train | n_vars | chosen_lag |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | us_retail | False | 33.784002 | NaN | NaN | NaN | 1.999501 | us_retail | 2413 | 3 | 1 |
📄 CSV: us_retail_forecast_prices_h10.csv
| | Unnamed: 0 | WMT | COST | TGT |
|---|---|---|---|---|
| 0 | 2025-11-21 | 106.982471 | 894.052621 | 83.639320 |
| 1 | 2025-11-24 | 107.072446 | 894.784533 | 83.659800 |
| 2 | 2025-11-25 | 107.150979 | 895.517396 | 83.681690 |
| 3 | 2025-11-26 | 107.229977 | 896.250763 | 83.703431 |
| 4 | 2025-11-27 | 107.309022 | 896.984739 | 83.725185 |
| 5 | 2025-11-28 | 107.388126 | 897.719316 | 83.746945 |
| 6 | 2025-12-01 | 107.467289 | 898.454495 | 83.768710 |
| 7 | 2025-12-02 | 107.546509 | 899.190275 | 83.790481 |
| 8 | 2025-12-03 | 107.625788 | 899.926659 | 83.812257 |
| 9 | 2025-12-04 | 107.705126 | 900.663645 | 83.834039 |
📄 CSV: us_retail_forecast_prices_h100.csv
| | Unnamed: 0 | WMT | COST | TGT |
|---|---|---|---|---|
| 0 | 2025-11-21 | 106.982471 | 894.052621 | 83.639320 |
| 1 | 2025-11-24 | 107.072446 | 894.784533 | 83.659800 |
| 2 | 2025-11-25 | 107.150979 | 895.517396 | 83.681690 |
| 3 | 2025-11-26 | 107.229977 | 896.250763 | 83.703431 |
| 4 | 2025-11-27 | 107.309022 | 896.984739 | 83.725185 |
| 5 | 2025-11-28 | 107.388126 | 897.719316 | 83.746945 |
| 6 | 2025-12-01 | 107.467289 | 898.454495 | 83.768710 |
| 7 | 2025-12-02 | 107.546509 | 899.190275 | 83.790481 |
| 8 | 2025-12-03 | 107.625788 | 899.926659 | 83.812257 |
| 9 | 2025-12-04 | 107.705126 | 900.663645 | 83.834039 |
📄 CSV: us_retail_forecast_prices_h20.csv
| | Unnamed: 0 | WMT | COST | TGT |
|---|---|---|---|---|
| 0 | 2025-11-21 | 106.982471 | 894.052621 | 83.639320 |
| 1 | 2025-11-24 | 107.072446 | 894.784533 | 83.659800 |
| 2 | 2025-11-25 | 107.150979 | 895.517396 | 83.681690 |
| 3 | 2025-11-26 | 107.229977 | 896.250763 | 83.703431 |
| 4 | 2025-11-27 | 107.309022 | 896.984739 | 83.725185 |
| 5 | 2025-11-28 | 107.388126 | 897.719316 | 83.746945 |
| 6 | 2025-12-01 | 107.467289 | 898.454495 | 83.768710 |
| 7 | 2025-12-02 | 107.546509 | 899.190275 | 83.790481 |
| 8 | 2025-12-03 | 107.625788 | 899.926659 | 83.812257 |
| 9 | 2025-12-04 | 107.705126 | 900.663645 | 83.834039 |
📄 CSV: us_retail_forecast_returns_h10.csv
| | Unnamed: 0 | WMT | COST | TGT |
|---|---|---|---|---|
| 0 | 2025-11-21 | -0.001191 | 0.000853 | -0.000486 |
| 1 | 2025-11-24 | 0.000841 | 0.000818 | 0.000245 |
| 2 | 2025-11-25 | 0.000733 | 0.000819 | 0.000262 |
| 3 | 2025-11-26 | 0.000737 | 0.000819 | 0.000260 |
| 4 | 2025-11-27 | 0.000737 | 0.000819 | 0.000260 |
| 5 | 2025-11-28 | 0.000737 | 0.000819 | 0.000260 |
| 6 | 2025-12-01 | 0.000737 | 0.000819 | 0.000260 |
| 7 | 2025-12-02 | 0.000737 | 0.000819 | 0.000260 |
| 8 | 2025-12-03 | 0.000737 | 0.000819 | 0.000260 |
| 9 | 2025-12-04 | 0.000737 | 0.000819 | 0.000260 |
📄 CSV: us_retail_forecast_returns_h100.csv
| | Unnamed: 0 | WMT | COST | TGT |
|---|---|---|---|---|
| 0 | 2025-11-21 | -0.001191 | 0.000853 | -0.000486 |
| 1 | 2025-11-24 | 0.000841 | 0.000818 | 0.000245 |
| 2 | 2025-11-25 | 0.000733 | 0.000819 | 0.000262 |
| 3 | 2025-11-26 | 0.000737 | 0.000819 | 0.000260 |
| 4 | 2025-11-27 | 0.000737 | 0.000819 | 0.000260 |
| 5 | 2025-11-28 | 0.000737 | 0.000819 | 0.000260 |
| 6 | 2025-12-01 | 0.000737 | 0.000819 | 0.000260 |
| 7 | 2025-12-02 | 0.000737 | 0.000819 | 0.000260 |
| 8 | 2025-12-03 | 0.000737 | 0.000819 | 0.000260 |
| 9 | 2025-12-04 | 0.000737 | 0.000819 | 0.000260 |
📄 CSV: us_retail_forecast_returns_h20.csv
| | Unnamed: 0 | WMT | COST | TGT |
|---|---|---|---|---|
| 0 | 2025-11-21 | -0.001191 | 0.000853 | -0.000486 |
| 1 | 2025-11-24 | 0.000841 | 0.000818 | 0.000245 |
| 2 | 2025-11-25 | 0.000733 | 0.000819 | 0.000262 |
| 3 | 2025-11-26 | 0.000737 | 0.000819 | 0.000260 |
| 4 | 2025-11-27 | 0.000737 | 0.000819 | 0.000260 |
| 5 | 2025-11-28 | 0.000737 | 0.000819 | 0.000260 |
| 6 | 2025-12-01 | 0.000737 | 0.000819 | 0.000260 |
| 7 | 2025-12-02 | 0.000737 | 0.000819 | 0.000260 |
| 8 | 2025-12-03 | 0.000737 | 0.000819 | 0.000260 |
| 9 | 2025-12-04 | 0.000737 | 0.000819 | 0.000260 |
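The price tables and the return tables for us_retail appear to be linked by the usual log-return identity, price_t = price_{t-1} * exp(r_t). The sketch below checks this on the first three WMT rows shown above. It assumes the return files hold log returns; `last_price` is a value backed out from the tables for illustration, not a number reported in the notebook.

```python
import numpy as np

# First three WMT forecasts copied from the return and price tables above
log_returns = np.array([-0.001191, 0.000841, 0.000733])
reported_prices = np.array([106.982471, 107.072446, 107.150979])

# Last observed price implied by the first forecast row
# (inferred for illustration, not reported in the tables)
last_price = reported_prices[0] / np.exp(log_returns[0])

# Cumulative log returns become a price path after exponentiation
rebuilt_prices = last_price * np.exp(np.cumsum(log_returns))

print(np.round(rebuilt_prices, 3))
```

The rebuilt path agrees with the reported prices up to the six-decimal rounding of the returns, which is consistent with the forecasts being produced in returns and then converted to levels.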
📄 CSV: us_retail_rolling_eval.csv
| | ticker | RMSE | MAE |
|---|---|---|---|
| 0 | WMT | 0.013503 | 0.008798 |
| 1 | COST | 0.010649 | 0.008172 |
| 2 | TGT | 0.017805 | 0.013944 |
📄 CSV: us_tech_VAR_coeffs.csv
| | Unnamed: 0 | AAPL | MSFT | NVDA |
|---|---|---|---|---|
| 0 | const | 0.000766 | 0.001104 | 0.002273 |
| 1 | L1.AAPL | 0.001441 | -0.046184 | -0.031821 |
| 2 | L1.MSFT | -0.105998 | -0.151740 | -0.165332 |
| 3 | L1.NVDA | 0.019841 | 0.034317 | 0.001764 |
| 4 | L2.AAPL | 0.008761 | -0.013327 | -0.011733 |
| 5 | L2.MSFT | -0.077339 | -0.058800 | -0.063556 |
| 6 | L2.NVDA | 0.045931 | 0.026213 | 0.057473 |
| 7 | L3.AAPL | -0.082228 | -0.088988 | -0.108386 |
| 8 | L3.MSFT | 0.095441 | 0.047166 | 0.115579 |
| 9 | L3.NVDA | -0.005091 | 0.009845 | -0.012694 |
📄 CSV: us_tech_diagnostics.csv
| | Unnamed: 0 | stable | max_root_modulus | serial_pvalue | normality_pvalue | arch_pvalue | dw_mean | sector | n_obs_train | n_vars | chosen_lag |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | us_tech | False | 1.730068 | NaN | NaN | NaN | 1.996385 | us_tech | 2413 | 3 | 9 |
📄 CSV: us_tech_forecast_prices_h10.csv
| | Unnamed: 0 | AAPL | MSFT | NVDA |
|---|---|---|---|---|
| 0 | 2025-11-21 | 267.128394 | 478.730423 | 181.688619 |
| 1 | 2025-11-24 | 267.215133 | 478.337166 | 181.904773 |
| 2 | 2025-11-25 | 266.775108 | 477.680615 | 181.839489 |
| 3 | 2025-11-26 | 267.044446 | 477.808840 | 182.450524 |
| 4 | 2025-11-27 | 267.200126 | 478.600831 | 182.780324 |
| 5 | 2025-11-28 | 267.688967 | 480.125973 | 183.106679 |
| 6 | 2025-12-01 | 267.285250 | 480.268962 | 182.933760 |
| 7 | 2025-12-02 | 267.900427 | 482.052664 | 183.944021 |
| 8 | 2025-12-03 | 268.250598 | 482.707000 | 184.616566 |
| 9 | 2025-12-04 | 268.552026 | 483.243189 | 185.102873 |
📄 CSV: us_tech_forecast_prices_h100.csv
| | Unnamed: 0 | AAPL | MSFT | NVDA |
|---|---|---|---|---|
| 0 | 2025-11-21 | 267.128394 | 478.730423 | 181.688619 |
| 1 | 2025-11-24 | 267.215133 | 478.337166 | 181.904773 |
| 2 | 2025-11-25 | 266.775108 | 477.680615 | 181.839489 |
| 3 | 2025-11-26 | 267.044446 | 477.808840 | 182.450524 |
| 4 | 2025-11-27 | 267.200126 | 478.600831 | 182.780324 |
| 5 | 2025-11-28 | 267.688967 | 480.125973 | 183.106679 |
| 6 | 2025-12-01 | 267.285250 | 480.268962 | 182.933760 |
| 7 | 2025-12-02 | 267.900427 | 482.052664 | 183.944021 |
| 8 | 2025-12-03 | 268.250598 | 482.707000 | 184.616566 |
| 9 | 2025-12-04 | 268.552026 | 483.243189 | 185.102873 |
📄 CSV: us_tech_forecast_prices_h20.csv
| | Unnamed: 0 | AAPL | MSFT | NVDA |
|---|---|---|---|---|
| 0 | 2025-11-21 | 267.128394 | 478.730423 | 181.688619 |
| 1 | 2025-11-24 | 267.215133 | 478.337166 | 181.904773 |
| 2 | 2025-11-25 | 266.775108 | 477.680615 | 181.839489 |
| 3 | 2025-11-26 | 267.044446 | 477.808840 | 182.450524 |
| 4 | 2025-11-27 | 267.200126 | 478.600831 | 182.780324 |
| 5 | 2025-11-28 | 267.688967 | 480.125973 | 183.106679 |
| 6 | 2025-12-01 | 267.285250 | 480.268962 | 182.933760 |
| 7 | 2025-12-02 | 267.900427 | 482.052664 | 183.944021 |
| 8 | 2025-12-03 | 268.250598 | 482.707000 | 184.616566 |
| 9 | 2025-12-04 | 268.552026 | 483.243189 | 185.102873 |
📄 CSV: us_tech_forecast_returns_h10.csv
| | Unnamed: 0 | AAPL | MSFT | NVDA |
|---|---|---|---|---|
| 0 | 2025-11-21 | 0.003294 | 0.000628 | 0.005788 |
| 1 | 2025-11-24 | 0.000325 | -0.000822 | 0.001189 |
| 2 | 2025-11-25 | -0.001648 | -0.001374 | -0.000359 |
| 3 | 2025-11-26 | 0.001009 | 0.000268 | 0.003355 |
| 4 | 2025-11-27 | 0.000583 | 0.001656 | 0.001806 |
| 5 | 2025-11-28 | 0.001828 | 0.003182 | 0.001784 |
| 6 | 2025-12-01 | -0.001509 | 0.000298 | -0.000945 |
| 7 | 2025-12-02 | 0.002299 | 0.003707 | 0.005507 |
| 8 | 2025-12-03 | 0.001306 | 0.001356 | 0.003650 |
| 9 | 2025-12-04 | 0.001123 | 0.001110 | 0.002631 |
📄 CSV: us_tech_forecast_returns_h100.csv
| | Unnamed: 0 | AAPL | MSFT | NVDA |
|---|---|---|---|---|
| 0 | 2025-11-21 | 0.003294 | 0.000628 | 0.005788 |
| 1 | 2025-11-24 | 0.000325 | -0.000822 | 0.001189 |
| 2 | 2025-11-25 | -0.001648 | -0.001374 | -0.000359 |
| 3 | 2025-11-26 | 0.001009 | 0.000268 | 0.003355 |
| 4 | 2025-11-27 | 0.000583 | 0.001656 | 0.001806 |
| 5 | 2025-11-28 | 0.001828 | 0.003182 | 0.001784 |
| 6 | 2025-12-01 | -0.001509 | 0.000298 | -0.000945 |
| 7 | 2025-12-02 | 0.002299 | 0.003707 | 0.005507 |
| 8 | 2025-12-03 | 0.001306 | 0.001356 | 0.003650 |
| 9 | 2025-12-04 | 0.001123 | 0.001110 | 0.002631 |
📄 CSV: us_tech_forecast_returns_h20.csv
| | Unnamed: 0 | AAPL | MSFT | NVDA |
|---|---|---|---|---|
| 0 | 2025-11-21 | 0.003294 | 0.000628 | 0.005788 |
| 1 | 2025-11-24 | 0.000325 | -0.000822 | 0.001189 |
| 2 | 2025-11-25 | -0.001648 | -0.001374 | -0.000359 |
| 3 | 2025-11-26 | 0.001009 | 0.000268 | 0.003355 |
| 4 | 2025-11-27 | 0.000583 | 0.001656 | 0.001806 |
| 5 | 2025-11-28 | 0.001828 | 0.003182 | 0.001784 |
| 6 | 2025-12-01 | -0.001509 | 0.000298 | -0.000945 |
| 7 | 2025-12-02 | 0.002299 | 0.003707 | 0.005507 |
| 8 | 2025-12-03 | 0.001306 | 0.001356 | 0.003650 |
| 9 | 2025-12-04 | 0.001123 | 0.001110 | 0.002631 |
📄 CSV: us_tech_rolling_eval.csv
| | ticker | RMSE | MAE |
|---|---|---|---|
| 0 | AAPL | 0.014857 | 0.010576 |
| 1 | MSFT | 0.011791 | 0.008799 |
| 2 | NVDA | 0.021276 | 0.016575 |
===== TXT summary =====
📜 TXT: rus_fin_VAR_summary.txt
Summary of Regression Results
==================================
Model: VAR
Method: OLS
Date: Sun, 23, Nov, 2025
Time: 21:05:47
--------------------------------------------------------------------
No. of Equations: 3.00000 BIC: -22.9311
Nobs: 321.000 HQIC: -23.3969
Log likelihood: 2504.46 FPE: 5.06594e-11
AIC: -23.7065 Det(Omega_mle): 4.15234e-11
--------------------------------------------------------------------
Results for equation SBER.ME
=============================================================================
coefficient std. error t-stat prob
-----------------------------------------------------------------------------
const -0.000812 0.001116 -0.728 0.466
L1.SBER.ME -0.067880 0.084686 -0.802 0.423
L1.VTBR.ME 0.050555 0.088692 0.570 0.569
L1.TCSG.ME 0.007783 0.040678 0.191 0.848
L2.SBER.ME -0.008981 0.083946 -0.107 0.915
L2.VTBR.ME -0.066138 0.089235 -0.741 0.459
L2.TCSG.ME 0.018146 0.040709 0.446 0.656
L3.SBER.ME -0.172843 0.084269 -2.051 0.040
L3.VTBR.ME -0.015094 0.088766 -0.170 0.865
L3.TCSG.ME 0.177597 0.040526 4.382 0.000
L4.SBER.ME 0.336753 0.084832 3.970 0.000
L4.VTBR.ME -0.215702 0.088069 -2.449 0.014
L4.TCSG.ME -0.067918 0.042639 -1.593 0.111
L5.SBER.ME 0.216999 0.086413 2.511 0.012
L5.VTBR.ME -0.
📜 TXT: rus_met_VAR_summary.txt
Summary of Regression Results
==================================
Model: VAR
Method: OLS
Date: Sun, 23, Nov, 2025
Time: 21:05:50
--------------------------------------------------------------------
No. of Equations: 3.00000 BIC: -25.1609
Nobs: 1514.00 HQIC: -25.1873
Log likelihood: 12645.9 FPE: 1.13358e-11
AIC: -25.2031 Det(Omega_mle): 1.12464e-11
--------------------------------------------------------------------
Results for equation NLMK.ME
=============================================================================
coefficient std. error t-stat prob
-----------------------------------------------------------------------------
const 0.001201 0.000445 2.700 0.007
L1.NLMK.ME -0.094152 0.032943 -2.858 0.004
L1.GMKN.ME 0.019060 0.027577 0.691 0.489
L1.CHMF.ME 0.087085 0.037644 2.313 0.021
=============================================================================
Results for equation GMKN.ME
=============================================================================
coefficient std. error t-stat prob
-----------------------------------------------------------------------------
const 0.000496 0.000447 1.109 0.267
L1.NLMK.ME 0.009384 0.033126 0.283 0.777
L1.GMKN.ME 0.039199 0.027730 1.414 0.157
L1.CHMF.ME 0.016557 0.037853 0.437 0.662
=============================================================================
Results for equation CHMF.ME
=======================================
📜 TXT: rus_oil_VAR_summary.txt
Summary of Regression Results
==================================
Model: VAR
Method: OLS
Date: Sun, 23, Nov, 2025
Time: 21:05:49
--------------------------------------------------------------------
No. of Equations: 3.00000 BIC: -25.5014
Nobs: 1513.00 HQIC: -25.5478
Log likelihood: 12928.1 FPE: 7.81250e-12
AIC: -25.5753 Det(Omega_mle): 7.70506e-12
--------------------------------------------------------------------
Results for equation GAZP.ME
=============================================================================
coefficient std. error t-stat prob
-----------------------------------------------------------------------------
const 0.000783 0.000392 1.995 0.046
L1.GAZP.ME 0.077234 0.031948 2.417 0.016
L1.LKOH.ME -0.026727 0.031154 -0.858 0.391
L1.ROSN.ME 0.001376 0.032071 0.043 0.966
L2.GAZP.ME 0.027767 0.031925 0.870 0.384
L2.LKOH.ME -0.052866 0.031085 -1.701 0.089
L2.ROSN.ME 0.006264 0.031922 0.196 0.844
=============================================================================
Results for equation LKOH.ME
=============================================================================
coefficient std. error t-stat prob
-----------------------------------------------------------------------------
const 0.000940 0.000457 2.055 0.040
L1.GAZP.ME -0.011864 0.037231 -0.319 0.750
L1.LKOH.ME -0.040125 0.036307 -1.105
📜 TXT: us_energy_VAR_summary.txt
Summary of Regression Results
==================================
Model: VAR
Method: OLS
Date: Sun, 23, Nov, 2025
Time: 21:06:10
--------------------------------------------------------------------
No. of Equations: 3.00000 BIC: -25.7027
Nobs: 2406.00 HQIC: -25.8037
Log likelihood: 20935.4 FPE: 5.86841e-12
AIC: -25.8614 Det(Omega_mle): 5.71033e-12
--------------------------------------------------------------------
Results for equation XOM
=========================================================================
coefficient std. error t-stat prob
-------------------------------------------------------------------------
const 0.000329 0.000359 0.917 0.359
L1.XOM 0.112250 0.039396 2.849 0.004
L1.CVX -0.135727 0.038881 -3.491 0.000
L1.COP -0.012350 0.026862 -0.460 0.646
L2.XOM -0.095013 0.039435 -2.409 0.016
L2.CVX 0.072186 0.038921 1.855 0.064
L2.COP 0.039296 0.026859 1.463 0.143
L3.XOM -0.097785 0.039545 -2.473 0.013
L3.CVX -0.017596 0.038948 -0.452 0.651
L3.COP 0.069671 0.026837 2.596 0.009
L4.XOM 0.051083 0.039649 1.288 0.198
L4.CVX -0.124363 0.038951 -3.193 0.001
L4.COP 0.067654 0.026853 2.519 0.012
L5.XOM 0.047316 0.039589 1.195 0.232
L5.CVX -0.101167 0.039025 -2.592 0.010
L5.COP 0.07
📜 TXT: us_retail_VAR_summary.txt
Summary of Regression Results
==================================
Model: VAR
Method: OLS
Date: Sun, 23, Nov, 2025
Time: 21:06:03
--------------------------------------------------------------------
No. of Equations: 3.00000 BIC: -25.4775
Nobs: 2412.00 HQIC: -25.4958
Log likelihood: 20505.2 FPE: 8.37039e-12
AIC: -25.5063 Det(Omega_mle): 8.32889e-12
--------------------------------------------------------------------
Results for equation WMT
==========================================================================
coefficient std. error t-stat prob
--------------------------------------------------------------------------
const 0.000772 0.000276 2.797 0.005
L1.WMT -0.040524 0.025221 -1.607 0.108
L1.COST 0.004596 0.024807 0.185 0.853
L1.TGT -0.034155 0.014933 -2.287 0.022
==========================================================================
Results for equation COST
==========================================================================
coefficient std. error t-stat prob
--------------------------------------------------------------------------
const 0.000845 0.000285 2.960 0.003
L1.WMT 0.000503 0.026094 0.019 0.985
L1.COST -0.032054 0.025666 -1.249 0.212
L1.TGT -0.002398 0.015450 -0.155 0.877
==========================================================================
Results for equation TGT
==========================================================================
coefficien
📜 TXT: us_tech_VAR_summary.txt
Summary of Regression Results
==================================
Model: VAR
Method: OLS
Date: Sun, 23, Nov, 2025
Time: 21:05:52
--------------------------------------------------------------------
No. of Equations: 3.00000 BIC: -24.0541
Nobs: 2404.00 HQIC: -24.1827
Log likelihood: 19006.6 FPE: 2.92194e-11
AIC: -24.2562 Det(Omega_mle): 2.82217e-11
--------------------------------------------------------------------
Results for equation AAPL
==========================================================================
coefficient std. error t-stat prob
--------------------------------------------------------------------------
const 0.000766 0.000381 2.008 0.045
L1.AAPL 0.001441 0.028744 0.050 0.960
L1.MSFT -0.105998 0.033638 -3.151 0.002
L1.NVDA 0.019841 0.015473 1.282 0.200
L2.AAPL 0.008761 0.028780 0.304 0.761
L2.MSFT -0.077339 0.033760 -2.291 0.022
L2.NVDA 0.045931 0.015502 2.963 0.003
L3.AAPL -0.082228 0.028683 -2.867 0.004
L3.MSFT 0.095441 0.033679 2.834 0.005
L3.NVDA -0.005091 0.015520 -0.328 0.743
L4.AAPL -0.046314 0.028751 -1.611 0.107
L4.MSFT 0.053454 0.033728 1.585 0.113
L4.NVDA -0.004731 0.015519 -0.305 0.760
L5.AAPL 0.014151 0.028740 0.492 0.622
L5.MSFT 0.012282 0.033731 0.364 0.716
===== PNG visualizations =====
📌 Showing: rus_fin_FEVD.png
📌 Showing: rus_fin_IRF.png
📌 Showing: rus_fin_resid_SBER.ME_hist_acf.png
📌 Showing: rus_fin_resid_TCSG.ME_hist_acf.png
📌 Showing: rus_fin_resid_VTBR.ME_hist_acf.png
📌 Showing: rus_met_FEVD.png
📌 Showing: rus_met_IRF.png
📌 Showing: rus_met_resid_CHMF.ME_hist_acf.png
📌 Showing: rus_met_resid_GMKN.ME_hist_acf.png
📌 Showing: rus_met_resid_NLMK.ME_hist_acf.png
📌 Showing: rus_oil_FEVD.png
📌 Showing: rus_oil_IRF.png
📌 Showing: rus_oil_resid_GAZP.ME_hist_acf.png
📌 Showing: rus_oil_resid_LKOH.ME_hist_acf.png
📌 Showing: rus_oil_resid_ROSN.ME_hist_acf.png
📌 Showing: us_energy_FEVD.png
📌 Showing: us_energy_IRF.png
📌 Showing: us_energy_resid_COP_hist_acf.png
📌 Showing: us_energy_resid_CVX_hist_acf.png
📌 Showing: us_energy_resid_XOM_hist_acf.png
📌 Showing: us_retail_FEVD.png
📌 Showing: us_retail_IRF.png
📌 Showing: us_retail_resid_COST_hist_acf.png
📌 Showing: us_retail_resid_TGT_hist_acf.png
📌 Showing: us_retail_resid_WMT_hist_acf.png
📌 Showing: us_tech_FEVD.png
📌 Showing: us_tech_IRF.png
📌 Showing: us_tech_resid_AAPL_hist_acf.png
📌 Showing: us_tech_resid_MSFT_hist_acf.png
📌 Showing: us_tech_resid_NVDA_hist_acf.png
Johansen test for cointegrating vectors
# johansen_per_sector.py
import os
import pandas as pd
import numpy as np
from statsmodels.tsa.vector_ar.vecm import coint_johansen

SECTORS = {
    "rus_fin": "rus_fin_prices.csv",
    "rus_oil": "rus_oil_prices.csv",
    "rus_met": "rus_met_prices.csv",
    "us_tech": "us_tech_prices.csv",
    "us_retail": "us_retail_prices.csv",
    "us_energy": "us_energy_prices.csv"
}
OUT = "VECM_results"
os.makedirs(OUT, exist_ok=True)


def johansen_test_df(df, det_order=0, k_ar_diff=1):
    # df must contain price levels with no NaN
    res = coint_johansen(df, det_order, k_ar_diff)
    trace = res.lr1  # trace statistics for H0: r <= 0, 1, ..., n_vars - 1
    cvt = res.cvt    # critical values: rows = hypotheses, columns = 90/95/99%
    return res, trace, cvt


summary_rows = []
for name, path in SECTORS.items():
    try:
        df = pd.read_csv(path, index_col=0, parse_dates=True)
        df = df.dropna(how='any')
        print(f"\n{name}: shape {df.shape}")
        # k_ar_diff should match the lag used later in the VECM
        # (the lag selected for the VAR on returns); 1 is a safe default
        k_ar_diff = 1
        res, trace, cvt = johansen_test_df(df, det_order=0, k_ar_diff=k_ar_diff)
        # sequential trace test at the 95% level:
        # stop at the first hypothesis r <= i that is not rejected
        rank = 0
        for i, tr in enumerate(trace):
            crit95 = cvt[i, 1]  # 95% critical value (column index 1)
            if tr <= crit95:
                break
            rank = i + 1
        # save summary
        summary_rows.append({"sector": name, "n_obs": df.shape[0], "n_vars": df.shape[1], "coint_rank": rank})
        # write detailed output
        with open(os.path.join(OUT, f"{name}_johansen.txt"), "w", encoding="utf-8") as f:
            f.write("Trace stats:\n")
            for i, t in enumerate(trace):
                f.write(f"r <= {i} : trace_stat = {t:.4f}, crit90={cvt[i,0]}, crit95={cvt[i,1]}, crit99={cvt[i,2]}\n")
            f.write("\nEigenvectors (res.evec):\n")
            f.write(str(res.evec))
        print(f"{name} -> estimated coint_rank = {rank}")
    except Exception as e:
        print("Failed", name, e)

pd.DataFrame(summary_rows).to_csv(os.path.join(OUT, "johansen_summary.csv"), index=False)
print("\nSaved johansen_summary.csv")
rus_fin: shape (626, 3)
rus_fin -> estimated coint_rank = 0
rus_oil: shape (1616, 3)
rus_oil -> estimated coint_rank = 0
rus_met: shape (1616, 3)
rus_met -> estimated coint_rank = 0
us_tech: shape (2514, 3)
us_tech -> estimated coint_rank = 1
us_retail: shape (2514, 3)
us_retail -> estimated coint_rank = 0
us_energy: shape (2514, 3)
us_energy -> estimated coint_rank = 0
Saved johansen_summary.csv
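The rank reported for each sector comes from comparing trace statistics with 95% critical values. A minimal sketch of the sequential rule (test r <= 0 first and stop at the first hypothesis that is not rejected) is given below. The helper name `rank_from_trace` is hypothetical; the us_tech numbers are the ones written to us_tech_johansen.txt, and the second input is made up purely to show why stopping early matters.

```python
def rank_from_trace(trace_stats, crit_values):
    """Return the first r for which H0: rank <= r is NOT rejected."""
    rank = 0
    for stat, crit in zip(trace_stats, crit_values):
        if stat <= crit:
            break  # cannot reject rank <= r, so stop here
        rank += 1
    return rank


# 95% critical values for three variables, as printed in the *_johansen.txt files
crit95 = [29.7961, 15.4943, 3.8415]

# us_tech trace statistics from us_tech_johansen.txt
us_tech_rank = rank_from_trace([46.6903, 4.7246, 1.8763], crit95)

# Hypothetical statistics: the first hypothesis is not rejected, so the rank is 0
# even though the last statistic exceeds its critical value
hypothetical_rank = rank_from_trace([20.0, 10.0, 5.0], crit95)

print(us_tech_rank, hypothetical_rank)
```

A rule that keeps scanning after the first non-rejection would assign rank 3 to the hypothetical input, which the sequential procedure does not support.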
Building the VECM
# build_vecm_for_ranked.py
import os
import pandas as pd
import numpy as np
from statsmodels.tsa.vector_ar.vecm import VECM
from datetime import datetime

IN_DIR = "VECM_results"
os.makedirs(IN_DIR, exist_ok=True)
johansen_summary = pd.read_csv(os.path.join(IN_DIR, "johansen_summary.csv"))

for row in johansen_summary.itertuples(index=False):
    name = row.sector
    rank = int(row.coint_rank)
    if rank <= 0:
        print(f"{name}: no cointegration (rank={rank}), skipping VECM.")
        continue
    # load price levels
    df = pd.read_csv(f"{name}_prices.csv", index_col=0, parse_dates=True)
    df = df.dropna(how='any')
    print(f"Building VECM for {name}, shape {df.shape}, rank {rank}")
    # number of lagged differences (k_ar_diff); can be tuned, 1 is used as the default
    k_ar_diff = 1
    try:
        vecm = VECM(df, k_ar_diff=k_ar_diff, coint_rank=rank, deterministic='ci')
        vecm_res = vecm.fit()
        # save summary
        with open(os.path.join(IN_DIR, f"{name}_VECM_summary.txt"), "w", encoding="utf-8") as f:
            try:
                f.write(str(vecm_res.summary()))
            except Exception:
                f.write("VECM fitted. Summary not printable.\n")
        # save alpha (loading coefficients) and beta (cointegrating vector)
        pd.DataFrame(vecm_res.alpha, index=df.columns).to_csv(os.path.join(IN_DIR, f"{name}_vecm_alpha.csv"))
        pd.DataFrame(vecm_res.beta, index=df.columns).to_csv(os.path.join(IN_DIR, f"{name}_vecm_beta.csv"))
        print(f"Saved VECM results for {name}")
        # forecasting: 10/20/100 steps ahead
        steps_list = [10, 20, 100]
        for steps in steps_list:
            fc = vecm_res.predict(steps=steps)
            start = df.index[-1] + pd.Timedelta(days=1)
            # the index length must equal `steps`; a stale global such as `h` here
            # raises "Shape of passed values is (10, 3), indices imply (100, 3)"
            idx = pd.date_range(start=start, periods=steps, freq='B')
            fc_df = pd.DataFrame(fc, index=idx, columns=df.columns)
            fc_df.to_csv(os.path.join(IN_DIR, f"{name}_vecm_forecast_h{steps}.csv"))
    except Exception as e:
        print(f"VECM fit failed for {name}: {e}")
rus_fin: no cointegration (rank=0), skipping VECM.
rus_oil: no cointegration (rank=0), skipping VECM.
rus_met: no cointegration (rank=0), skipping VECM.
Building VECM for us_tech, shape (2514, 3), rank 1
Saved VECM results for us_tech
VECM fit failed for us_tech: Shape of passed values is (10, 3), indices imply (100, 3)
us_retail: no cointegration (rank=0), skipping VECM.
us_energy: no cointegration (rank=0), skipping VECM.
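The failure logged for us_tech is a shape mismatch between the forecast array (10 rows) and the date index (100 entries), so no VECM forecast files were written in that run. One way to rule this out is to derive the index length from the forecast itself, as in the sketch below. The helper `forecast_frame` is hypothetical, the zero array stands in for a real forecast, and the last training date 2025-11-20 is inferred from the first forecast date in the tables, not stated in the notebook.

```python
import numpy as np
import pandas as pd


def forecast_frame(values, last_date, columns):
    """Wrap a forecast array in a DataFrame indexed by the following business days."""
    values = np.asarray(values)
    start = pd.Timestamp(last_date) + pd.Timedelta(days=1)
    # periods is taken from the array, so the index can never disagree with it
    idx = pd.date_range(start=start, periods=len(values), freq="B")
    return pd.DataFrame(values, index=idx, columns=columns)


# Placeholder for a 10-step forecast of three series
placeholder = np.zeros((10, 3))
frame = forecast_frame(placeholder, "2025-11-20", ["AAPL", "MSFT", "NVDA"])

print(frame.index[0].date(), frame.index[-1].date(), frame.shape)
```

Note that `freq="B"` skips weekends only. It still generates 2025-11-27, which is US Thanksgiving, so an exchange calendar would be needed if forecast dates must match actual trading sessions.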
show_csvs(dir='VECM_results')
show_txts(dir='VECM_results')
===== CSV tables =====
📄 CSV: johansen_summary.csv
| | sector | n_obs | n_vars | coint_rank |
|---|---|---|---|---|
| 0 | rus_fin | 626 | 3 | 0 |
| 1 | rus_oil | 1616 | 3 | 0 |
| 2 | rus_met | 1616 | 3 | 0 |
| 3 | us_tech | 2514 | 3 | 1 |
| 4 | us_retail | 2514 | 3 | 0 |
| 5 | us_energy | 2514 | 3 | 0 |
📄 CSV: us_tech_vecm_alpha.csv
| | Unnamed: 0 | 0 |
|---|---|---|
| 0 | AAPL | -0.014304 |
| 1 | MSFT | -0.013418 |
| 2 | NVDA | -0.011221 |
📄 CSV: us_tech_vecm_beta.csv
| | Unnamed: 0 | 0 |
|---|---|---|
| 0 | AAPL | 1.000000 |
| 1 | MSFT | -0.605214 |
| 2 | NVDA | 0.159978 |
===== TXT summary =====
📜 TXT: rus_fin_johansen.txt
Trace stats:
r <= 0 : trace_stat = 20.5218, crit90=27.0669, crit95=29.7961, crit99=35.4628
r <= 1 : trace_stat = 3.9606, crit90=13.4294, crit95=15.4943, crit99=19.9349
r <= 2 : trace_stat = 1.9101, crit90=2.7055, crit95=3.8415, crit99=6.6349
Eigenvectors (res.evec):
[[ 3.18679748e-02 -2.96416098e-03 -1.09273064e-02]
[-2.22903588e-03 2.58130983e-03 -3.37488058e-03]
[-1.14552177e-03 -4.38971114e-05 -2.92686747e-04]]
📜 TXT: rus_met_johansen.txt
Trace stats:
r <= 0 : trace_stat = 20.1479, crit90=27.0669, crit95=29.7961, crit99=35.4628
r <= 1 : trace_stat = 8.1063, crit90=13.4294, crit95=15.4943, crit99=19.9349
r <= 2 : trace_stat = 1.4505, crit90=2.7055, crit95=3.8415, crit99=6.6349
Eigenvectors (res.evec):
[[ 9.64932376e-02 5.42679164e-02 -2.24327732e-02]
[ 8.70301883e-05 -3.53601350e-04 1.06998467e-04]
[-1.68966780e-02 -2.59815708e-03 4.85229896e-03]]
📜 TXT: rus_oil_johansen.txt
Trace stats:
r <= 0 : trace_stat = 20.8228, crit90=27.0669, crit95=29.7961, crit99=35.4628
r <= 1 : trace_stat = 5.6565, crit90=13.4294, crit95=15.4943, crit99=19.9349
r <= 2 : trace_stat = 1.0505, crit90=2.7055, crit95=3.8415, crit99=6.6349
Eigenvectors (res.evec):
[[ 0.01626521 0.0204432 0.0288041 ]
[ 0.00113576 -0.00155161 -0.00058345]
[-0.02962675 0.00249476 -0.00402397]]
📜 TXT: us_energy_johansen.txt
Trace stats:
r <= 0 : trace_stat = 21.8316, crit90=27.0669, crit95=29.7961, crit99=35.4628
r <= 1 : trace_stat = 8.5973, crit90=13.4294, crit95=15.4943, crit99=19.9349
r <= 2 : trace_stat = 2.7075, crit90=2.7055, crit95=3.8415, crit99=6.6349
Eigenvectors (res.evec):
[[ 0.05489743 0.02398857 0.09237557]
[-0.13488873 0.01556991 0.00956623]
[ 0.09913617 -0.06809139 -0.07382377]]
📜 TXT: us_retail_johansen.txt
Trace stats:
r <= 0 : trace_stat = 21.1804, crit90=27.0669, crit95=29.7961, crit99=35.4628
r <= 1 : trace_stat = 5.7589, crit90=13.4294, crit95=15.4943, crit99=19.9349
r <= 2 : trace_stat = 0.8685, crit90=2.7055, crit95=3.8415, crit99=6.6349
Eigenvectors (res.evec):
[[ 0.10587704 0.06328421 -0.13665006]
[-0.01226271 -0.00411667 0.00913947]
[ 0.01191872 -0.01891335 -0.00719376]]
📜 TXT: us_tech_VECM_summary.txt
Det. terms outside the coint. relation & lagged endog. parameters for equation AAPL
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
L1.AAPL 0.0604 0.025 2.438 0.015 0.012 0.109
L1.MSFT -0.0391 0.015 -2.541 0.011 -0.069 -0.009
L1.NVDA -0.0279 0.029 -0.948 0.343 -0.086 0.030
Det. terms outside the coint. relation & lagged endog. parameters for equation MSFT
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
L1.AAPL -0.0854 0.043 -2.003 0.045 -0.169 -0.002
L1.MSFT -0.0666 0.026 -2.516 0.012 -0.119 -0.015
L1.NVDA 0.1193 0.051 2.357 0.018 0.020 0.218
Det. terms outside the coint. relation & lagged endog. parameters for equation NVDA
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
L1.AAPL 0.0129 0.019 0.689 0.491 -0.024 0.050
L1.MSFT -0.0110 0.012 -0.938 0.348 -0.034 0.012
L1.NVDA -0.0992 0.022 -4.447 0.000 -0.143 -0.055
Loading coefficients (alpha) for equation AAPL
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
ec1
📜 TXT: us_tech_johansen.txt
Trace stats:
r <= 0 : trace_stat = 46.6903, crit90=27.0669, crit95=29.7961, crit99=35.4628
r <= 1 : trace_stat = 4.7246, crit90=13.4294, crit95=15.4943, crit99=19.9349
r <= 2 : trace_stat = 1.8763, crit90=2.7055, crit95=3.8415, crit99=6.6349
Eigenvectors (res.evec):
[[ 0.06196515 -0.01950739 -0.02208805]
[-0.03742795 -0.00276564 0.01660908]
[ 0.00935774 0.02621031 -0.03264757]]
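The alpha and beta tables for us_tech can be read together: beta defines the long-run combination of price levels, and alpha says how strongly each price reacts to a deviation from it. The sketch below builds the long-run matrix alpha * beta' from the reported values and evaluates the error-correction term for a hypothetical price vector. The price levels are illustrative, and the constant inside the cointegration relation (the model uses deterministic='ci') is not in the saved CSV files, so it is omitted here.

```python
import numpy as np

# Values copied from us_tech_vecm_alpha.csv and us_tech_vecm_beta.csv (order: AAPL, MSFT, NVDA)
alpha = np.array([[-0.014304], [-0.013418], [-0.011221]])
beta = np.array([[1.000000], [-0.605214], [0.159978]])

# Long-run impact matrix; with one cointegrating vector it has rank 1
Pi = alpha @ beta.T
pi_rank = np.linalg.matrix_rank(Pi)

# Hypothetical price levels, chosen only for illustration
y = np.array([270.0, 480.0, 180.0])

# Error-correction term beta' y (constant term omitted, see the note above)
ect = float(beta[:, 0] @ y)

# Contribution of the error-correction term to each one-step price change
adjustment = alpha[:, 0] * ect

print(pi_rank, round(ect, 5), np.round(adjustment, 4))
```

All three loadings are negative, so under this sketch a positive deviation pushes all three prices down in the next step.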
Fitting VARMAX with exogenous variables
# varmax_example.py
import os
import yfinance as yf
import pandas as pd
import numpy as np
from statsmodels.tsa.statespace.varmax import VARMAX

# example for rus_fin
sector_file = "rus_fin_prices.csv"
df_prices = pd.read_csv(sector_file, index_col=0, parse_dates=True).dropna(how='any')

# download the exogenous series
exogs = yf.download(
    ["IMOEX.ME", "BRN-USD"],
    start=df_prices.index[0].strftime("%Y-%m-%d"),
    end=df_prices.index[-1].strftime("%Y-%m-%d"),
    progress=False,
    threads=False,
)
# If a ticker does not resolve, replace it with the correct one.
# Convert to log returns (exog)
exog_prices = exogs['Adj Close'] if 'Adj Close' in exogs else exogs['Close']
exog_prices = exog_prices.dropna(how='all')
exog_returns = np.log(exog_prices).diff().reindex(df_prices.index).dropna(how='any')

# target variable: log returns of the sector
df_returns = np.log(df_prices).diff().dropna(how='any')

# align exog and endog on a common index
common_idx = df_returns.index.intersection(exog_returns.index)
endog = df_returns.loc[common_idx]
exog = exog_returns.loc[common_idx]

# VARMAX(p, q) model; start with (1, 1)
mod = VARMAX(endog, exog=exog, order=(1, 1))
res = mod.fit(maxiter=200, disp=False)
print(res.summary())

# Forecast (20 steps as an example); future exog values are unknown,
# so the last observed exog row is repeated for every step
fc = res.get_forecast(steps=20, exog=exog.iloc[-1:].values.repeat(20, axis=0))
pred = fc.predicted_mean
os.makedirs("VARMAX_results", exist_ok=True)  # the output directory must exist before writing
pred.to_csv(os.path.join("VARMAX_results", "rus_fin_varmax_forecast_h20.csv"))
Statespace Model Results
=============================================================================================
Dep. Variable: ['SBER.ME', 'VTBR.ME', 'TCSG.ME'] No. Observations: 93
Model: VARMAX(1,1) Log Likelihood 836.162
+ intercept AIC -1606.324
Date: Sun, 23 Nov 2025 BIC -1522.748
Time: 21:08:34 HQIC -1572.579
Sample: 0
- 93
Covariance Type: opg
===================================================================================
Ljung-Box (L1) (Q): 0.00, 0.01, 0.00 Jarque-Bera (JB): 0.35, 1.19, 1.48
Prob(Q): 0.99, 0.92, 0.99 Prob(JB): 0.84, 0.55, 0.48
Heteroskedasticity (H): 0.71, 0.80, 1.57 Skew: 0.01, 0.09, -0.17
Prob(H) (two-sided): 0.35, 0.54, 0.21 Kurtosis: 2.70, 2.48, 2.48
Results for equation SBER.ME
======================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------
intercept.SBER 0.0006 0.001 0.456 0.648 -0.002 0.003
L1.SBER.ME.SBER -0.2008 0.452 -0.444 0.657 -1.088 0.686
L1.VTBR.ME.SBER 0.1947 0.452 0.431 0.667 -0.692 1.081
L1.TCSG.ME.SBER 0.0200 0.220 0.091 0.927 -0.411 0.451
L1.e(SBER.ME).SBER 0.0333 0.484 0.069 0.945 -0.915 0.982
L1.e(VTBR.ME).SBER 0.0436 0.460 0.095 0.925 -0.858 0.945
L1.e(TCSG.ME).SBER 0.0237 0.210 0.113 0.910 -0.388 0.435
beta.BRN-USD.SBER -0.0191 0.031 -0.616 0.538 -0.080 0.042
beta.IMOEX.ME.SBER 1.2744 0.141 9.028 0.000 0.998 1.551
Results for equation VTBR.ME
======================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------
intercept.VTBR -7.402e-05 0.001 -0.062 0.951 -0.002 0.002
L1.SBER.ME.VTBR 0.0528 0.397 0.133 0.894 -0.726 0.832
L1.VTBR.ME.VTBR 0.0683 0.396 0.173 0.863 -0.707 0.844
L1.TCSG.ME.VTBR 0.0144 0.250 0.058 0.954 -0.475 0.504
L1.e(SBER.ME).VTBR 0.0221 0.413 0.054 0.957 -0.786 0.831
L1.e(VTBR.ME).VTBR -0.0354 0.426 -0.083 0.934 -0.871 0.800
L1.e(TCSG.ME).VTBR -0.0154 0.249 -0.062 0.951 -0.504 0.473
beta.BRN-USD.VTBR -0.0090 0.027 -0.337 0.736 -0.061 0.043
beta.IMOEX.ME.VTBR 1.1525 0.129 8.941 0.000 0.900 1.405
Results for equation TCSG.ME
======================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------
intercept.TCSG 0.0007 0.003 0.226 0.821 -0.005 0.007
L1.SBER.ME.TCSG -0.4424 0.897 -0.493 0.622 -2.200 1.315
L1.VTBR.ME.TCSG 0.5069 0.851 0.596 0.551 -1.161 2.174
L1.TCSG.ME.TCSG 0.0603 0.566 0.106 0.915 -1.050 1.170
L1.e(SBER.ME).TCSG 0.2572 0.927 0.277 0.781 -1.560 2.074
L1.e(VTBR.ME).TCSG 0.1114 0.922 0.121 0.904 -1.696 1.918
L1.e(TCSG.ME).TCSG 0.0543 0.582 0.093 0.926 -1.086 1.194
beta.BRN-USD.TCSG 0.0500 0.059 0.844 0.399 -0.066 0.166
beta.IMOEX.ME.TCSG 0.7670 0.309 2.479 0.013 0.161 1.374
Error covariance matrix
============================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------------
sqrt.var.SBER.ME 0.0092 0.001 10.791 0.000 0.008 0.011
sqrt.cov.SBER.ME.VTBR.ME 0.0024 0.001 1.657 0.097 -0.000 0.005
sqrt.var.VTBR.ME 0.0095 0.001 9.418 0.000 0.008 0.011
sqrt.cov.SBER.ME.TCSG.ME -0.0007 0.003 -0.246 0.806 -0.007 0.005
sqrt.cov.VTBR.ME.TCSG.ME -0.0031 0.003 -1.098 0.272 -0.009 0.002
sqrt.var.TCSG.ME 0.0202 0.002 8.521 0.000 0.016 0.025
============================================================================================
Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
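The summary reports 93 observations, while the Johansen step loaded 626 price rows for rus_fin. One plausible reason, not verified here, is that most rows are lost when the exogenous returns are aligned with the sector returns and incomplete rows are dropped. The sketch below shows a quick way to count what survives alignment. The helper `alignment_report`, the column names, and the synthetic data are all hypothetical.

```python
import numpy as np
import pandas as pd


def alignment_report(endog, exog):
    """Count rows before and after aligning endog with complete exog rows."""
    exog_complete = exog.dropna(how="any")
    common = endog.index.intersection(exog_complete.index)
    return {
        "endog_rows": len(endog),
        "exog_complete_rows": len(exog_complete),
        "common_rows": len(common),
    }


# Synthetic example: one exogenous series is missing for the first six days
idx = pd.bdate_range("2021-01-04", periods=10)
endog = pd.DataFrame(
    np.random.default_rng(0).normal(scale=0.01, size=(10, 3)),
    index=idx,
    columns=["A", "B", "C"],
)
exog = pd.DataFrame(
    {"index_ret": np.arange(10, dtype=float), "oil_ret": [np.nan] * 6 + [0.1, 0.2, 0.3, 0.4]},
    index=idx,
)

report = alignment_report(endog, exog)
print(report)
```

Running such a check before fitting makes it visible when a single sparse exogenous ticker shrinks the estimation sample.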
Model comparison
We evaluate the models with a rolling one-step-ahead RMSE.
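As a rough illustration of the metric, the sketch below runs an expanding-window, one-step-ahead evaluation on synthetic returns. The forecaster is a placeholder (the training-sample mean) and stands in for the fitted model; the function name and the synthetic data are assumptions made for this example only.

```python
import numpy as np


def expanding_one_step_errors(returns, min_train):
    """Expanding-window 1-step evaluation; returns per-column RMSE and MAE."""
    returns = np.asarray(returns, dtype=float)
    preds, actuals = [], []
    for train_end in range(min_train, len(returns)):
        train = returns[:train_end]          # all data up to, not including, the target day
        preds.append(train.mean(axis=0))     # placeholder forecaster: training mean
        actuals.append(returns[train_end])   # realized value one step ahead
    err = np.array(actuals) - np.array(preds)
    rmse = np.sqrt((err ** 2).mean(axis=0))
    mae = np.abs(err).mean(axis=0)
    return rmse, mae


# Synthetic daily returns for three series
rng = np.random.default_rng(0)
sample = rng.normal(scale=0.01, size=(300, 3))
rmse, mae = expanding_one_step_errors(sample, min_train=250)

print(np.round(rmse, 4), np.round(mae, 4))
```

MAE never exceeds RMSE for the same errors, which gives a quick sanity check on any evaluation table built this way.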
# rolling_var_evaluation.py
import os
import warnings
warnings.filterwarnings("ignore")
import pandas as pd
import numpy as np
from statsmodels.tsa.vector_ar.var_model import VAR
from sklearn.metrics import mean_squared_error, mean_absolute_error
from tqdm.auto import tqdm  # optional: drop tqdm if it is not installed
# ============= SETTINGS =============
SECTOR_FILES = {
    "rus_fin": "rus_fin_prices.csv",
    "rus_oil": "rus_oil_prices.csv",
    "rus_met": "rus_met_prices.csv",
    "us_tech": "us_tech_prices.csv",
    "us_retail": "us_retail_prices.csv",
    "us_energy": "us_energy_prices.csv"
}
OUT_DIR = "rolling_results"
os.makedirs(OUT_DIR, exist_ok=True)
MIN_TRAIN_SIZE = 250  # minimum training size before the rolling evaluation starts (trading days)
HOLDOUT_DAYS = 200  # number of days to evaluate (length of the rolling test period)
MAX_LAGS = 8  # maximum lag order for select_order (on the first step)
FIX_LAG = None  # set an integer (e.g. 2) to force the lag order; None selects it automatically
USE_EXPANDING = True  # True -> expanding window; False -> moving window
ROLLING_WINDOW_SIZE = 500  # window size when USE_EXPANDING=False
# ============= FUNCTIONS =============
def prepare_returns(df_prices):
    df_prices = df_prices.sort_index()
    returns = np.log(df_prices).diff().dropna(how='all')
    returns = returns.dropna(how='any')  # VAR requires complete rows
    return returns

def fit_var_and_forecast_one_step(train_returns, lag):
    """
    Fit VAR(lag) on train_returns (DataFrame) and forecast 1 step ahead (returns).
    Returns a tuple: (predicted 1-step array of shape (n_vars,), fitted VAR results).
    """
    model = VAR(train_returns)
    res = model.fit(lag)
    # use last k_ar observations from train_returns
    y_input = train_returns.values[-res.k_ar:]
    fc = res.forecast(y=y_input, steps=1)  # returns array shape (1, nvars)
    return fc.ravel(), res
# ============= MAIN LOOP per sector =============
overall_summary = []
for sector, file in SECTOR_FILES.items():
    print("\n" + "="*80)
    print("Sector:", sector)
    print("="*80)
    # load prices (levels)
    df_prices = pd.read_csv(file, index_col=0, parse_dates=True)
    df_prices = df_prices.sort_index()
    # prepare returns
    df_ret = prepare_returns(df_prices)
    n = df_ret.shape[0]
    if n < MIN_TRAIN_SIZE + 10:
        print(f"Not enough obs for sector {sector}: {n} < MIN_TRAIN_SIZE + 10, skipping.")
        continue
    # define start index for rolling: leave last HOLDOUT_DAYS for evaluation OR use MIN_TRAIN_SIZE
    start_idx = max(0, n - HOLDOUT_DAYS - MIN_TRAIN_SIZE)
    # we'll iterate from train_end = start_idx + MIN_TRAIN_SIZE - 1 up to n-2 (so that 1-step ahead exists)
    first_train_end = start_idx + MIN_TRAIN_SIZE - 1
    last_train_end = n - 2  # last index position where 1-step ahead exists
    train_end_positions = list(range(first_train_end, last_train_end + 1))
    m = len(train_end_positions)
    print(f"Total rolling steps: {m} (from pos {first_train_end} to {last_train_end})")
    # determine lag: either FIX_LAG or select_order on the initial training window
    if FIX_LAG is not None:
        chosen_lag = int(FIX_LAG)
        print("Using fixed lag:", chosen_lag)
    else:
        # initial training window
        init_train = df_ret.iloc[: first_train_end + 1]  # inclusive
        try:
            sel = VAR(init_train).select_order(MAX_LAGS)
            chosen_lag = sel.selected_orders.get('aic', None) or sel.selected_orders.get('bic', None) or 1
            if pd.isna(chosen_lag):
                chosen_lag = 1
            chosen_lag = int(chosen_lag)
        except Exception as e:
            print("select_order failed on init window, fallback to lag=1. Error:", e)
            chosen_lag = 1
        print("Chosen lag from initial selection:", chosen_lag)
    # storage for predictions and true values
    cols = df_ret.columns.tolist()
    preds = pd.DataFrame(index=[df_ret.index[i+1] for i in train_end_positions], columns=cols, dtype=float)
    trues = pd.DataFrame(index=preds.index, columns=cols, dtype=float)
    # rolling loop
    for i, train_end_pos in enumerate(tqdm(train_end_positions, desc=f"Rolling {sector}")):
        # define train window indices
        if USE_EXPANDING:
            train_start_pos = 0
        else:
            train_start_pos = max(0, train_end_pos - (ROLLING_WINDOW_SIZE - 1))
        train_df = df_ret.iloc[train_start_pos: train_end_pos + 1]  # inclusive
        test_pos = train_end_pos + 1
        test_date = df_ret.index[test_pos]
        # ensure enough observations to fit VAR(chosen_lag)
        if train_df.shape[0] <= chosen_lag:
            # not enough obs, fill NaN and continue
            preds.loc[test_date] = np.nan
            trues.loc[test_date] = df_ret.iloc[test_pos].values
            continue
        try:
            yhat, fitted = fit_var_and_forecast_one_step(train_df, chosen_lag)
            preds.loc[test_date] = yhat
            trues.loc[test_date] = df_ret.iloc[test_pos].values
        except Exception as e:
            # on error, log NaNs
            print(f"Step {i} error at date {test_date}: {e}")
            preds.loc[test_date] = np.nan
            trues.loc[test_date] = df_ret.iloc[test_pos].values
    # drop rows where preds are all NaN (failed steps)
    valid_mask = ~preds.isna().all(axis=1)
    preds = preds.loc[valid_mask]
    trues = trues.loc[valid_mask]
    # compute per-ticker RMSE/MAE
    rmse_per = ((preds - trues)**2).mean().apply(np.sqrt)
    mae_per = (preds - trues).abs().mean()
    summary_df = pd.DataFrame({
        "RMSE": rmse_per,
        "MAE": mae_per
    })
    summary_df.to_csv(os.path.join(OUT_DIR, f"{sector}_rolling_errors_per_ticker.csv"))
    # overall aggregated metrics (mean across tickers)
    overall_rmse = rmse_per.mean()
    overall_mae = mae_per.mean()
    # save preds and trues
    preds.to_csv(os.path.join(OUT_DIR, f"{sector}_rolling_preds.csv"))
    trues.to_csv(os.path.join(OUT_DIR, f"{sector}_rolling_true.csv"))
    summary_df.to_csv(os.path.join(OUT_DIR, f"{sector}_rolling_summary.csv"))
    # Save overall summary row
    overall_summary.append({
        "sector": sector,
        "n_steps": preds.shape[0],
        "chosen_lag": chosen_lag,
        "overall_RMSE_mean": overall_rmse,
        "overall_MAE_mean": overall_mae
    })
    # Plot aggregated errors over time (averaged across tickers per date)
    mad_series = (preds - trues).abs().mean(axis=1)
    rmse_series = np.sqrt(((preds - trues)**2).mean(axis=1))
    # plot MAE and RMSE over time
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots(1, 2, figsize=(14, 4))
    ax[0].plot(mad_series.index, mad_series.values)
    ax[0].set_title(f"{sector} - Mean Absolute Error (one-step) over time")
    ax[0].set_xlabel("Date")
    ax[0].set_ylabel("MAE")
    ax[0].grid(True)
    ax[1].plot(rmse_series.index, rmse_series.values)
    ax[1].set_title(f"{sector} - RMSE (one-step) over time")
    ax[1].set_xlabel("Date")
    ax[1].set_ylabel("RMSE")
    ax[1].grid(True)
    plt.tight_layout()
    fig_path = os.path.join(OUT_DIR, f"{sector}_rolling_error_timeseries.png")
    fig.savefig(fig_path, bbox_inches='tight', dpi=150)
    plt.show()
    plt.close(fig)
# Save overall summary
overall_df = pd.DataFrame(overall_summary)
overall_df.to_csv(os.path.join(OUT_DIR, "rolling_overall_summary.csv"), index=False)
print("\nRolling evaluation completed. Results are in folder:", OUT_DIR)
================================================================================
Sector: rus_fin
================================================================================
Total rolling steps: 178 (from pos 249 to 426)
Chosen lag from initial selection: 7
Rolling rus_fin: 100%|██████████| 178/178 [00:01<00:00, 139.18it/s]

================================================================================
Sector: rus_oil
================================================================================
Total rolling steps: 200 (from pos 1414 to 1613)
Chosen lag from initial selection: 1
Rolling rus_oil: 100%|██████████| 200/200 [00:01<00:00, 153.86it/s]

================================================================================
Sector: rus_met
================================================================================
Total rolling steps: 200 (from pos 1414 to 1613)
Chosen lag from initial selection: 1
Rolling rus_met: 100%|██████████| 200/200 [00:02<00:00, 97.75it/s]

================================================================================
Sector: us_tech
================================================================================
Total rolling steps: 200 (from pos 2312 to 2511)
Chosen lag from initial selection: 8
Rolling us_tech: 100%|██████████| 200/200 [00:08<00:00, 24.25it/s]

================================================================================
Sector: us_retail
================================================================================
Total rolling steps: 200 (from pos 2312 to 2511)
Chosen lag from initial selection: 1
Rolling us_retail: 100%|██████████| 200/200 [00:02<00:00, 87.40it/s]

================================================================================
Sector: us_energy
================================================================================
Total rolling steps: 200 (from pos 2312 to 2511)
Chosen lag from initial selection: 7
Rolling us_energy: 100%|██████████| 200/200 [00:06<00:00, 29.06it/s]

Rolling evaluation completed. Results are in folder: rolling_results
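To make the evaluation scheme above easier to follow, here is a stripped-down sketch of the same expanding-window, one-step-ahead loop on a single toy series. It uses only the standard library. The forecaster is a placeholder (the mean of the training window) standing in for the VAR fit, and the function names and numbers are invented for illustration.

```python
import math

def expanding_window_errors(series, min_train):
    """One-step-ahead forecast errors with an expanding training window."""
    errors = []
    # train_end runs up to len(series) - 2 so that a next observation exists
    for train_end in range(min_train - 1, len(series) - 1):
        train = series[: train_end + 1]       # inclusive end, as in the cell above
        forecast = sum(train) / len(train)    # placeholder model, not a VAR
        actual = series[train_end + 1]        # the next, unseen observation
        errors.append(forecast - actual)
    return errors

def rmse_of(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def mae_of(errors):
    return sum(abs(e) for e in errors) / len(errors)

toy_returns = [0.01, -0.02, 0.015, 0.0, -0.005, 0.02, -0.01]
errs = expanding_window_errors(toy_returns, min_train=3)
print("steps:", len(errs))
print("RMSE:", round(rmse_of(errs), 5), "MAE:", round(mae_of(errs), 5))
```

RMSE is never smaller than MAE on the same errors, so a large gap between the two in the per-ticker tables points to a few large misses rather than uniformly poor forecasts.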
Now let's produce the forecast itself.
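The VAR is fitted on log-returns, so its forecasts have to be compounded back into price levels: P[T+h] = P[T] * exp(r[T+1] + ... + r[T+h]). The sketch below checks that the step-by-step recursion used in the notebook and the closed form via a cumulative sum give the same prices. Both function names and the numbers are hypothetical, and plain floats stand in for the pandas objects.

```python
import math

def prices_step_by_step(last_price, log_returns):
    """Compound one forecasted log-return at a time."""
    prices, prev = [], last_price
    for r in log_returns:
        prev = prev * math.exp(r)
        prices.append(prev)
    return prices

def prices_from_cumsum(last_price, log_returns):
    """Same result via the cumulative sum of log-returns."""
    prices, total = [], 0.0
    for r in log_returns:
        total += r
        prices.append(last_price * math.exp(total))
    return prices

toy_forecast = [math.log(1.1), math.log(1 / 1.1), 0.05]
stepwise = prices_step_by_step(100.0, toy_forecast)
closed_form = prices_from_cumsum(100.0, toy_forecast)
print([round(p, 4) for p in stepwise])
print([round(p, 4) for p in closed_form])
```

Because forecasted returns are compounded, any bias in the mean return forecast accumulates with the horizon, which matters for the 100-session forecasts.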
import os
import warnings
warnings.filterwarnings("ignore")
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.vector_ar.var_model import VAR
# ---------------- SETTINGS (adjust the paths if needed) ----------------
SECTORS = {
    "rus_fin": "rus_fin_prices.csv",
    "rus_oil": "rus_oil_prices.csv",
    "rus_met": "rus_met_prices.csv",
    "us_tech": "us_tech_prices.csv",
    "us_retail": "us_retail_prices.csv",
    "us_energy": "us_energy_prices.csv"
}
OUT_DIR = "FORECASTS"
OUT_VAR = os.path.join(OUT_DIR, "VAR")
os.makedirs(OUT_VAR, exist_ok=True)
# default parameters
HOLDOUT_DAYS = 60  # size of the test part (backtest); change as needed
MAX_LAGS_VAR = 8
DEFAULT_VAR_LAG = 2
PLOT_XLIM_LEFT = pd.to_datetime("2005-03-01")  # left x-axis limit for the plots (example value)
# ---------------- utility functions ----------------
def ensure_numeric_df(df):
    df2 = df.copy()
    for c in df2.columns:
        df2[c] = pd.to_numeric(df2[c], errors='coerce')
    df2 = df2.dropna(axis=1, how='all')
    return df2

def compute_log_returns(df_prices):
    return np.log(df_prices).diff().dropna(how='any')

def make_future_index(last_index, h):
    start = last_index[-1] + pd.Timedelta(days=1)
    try:
        idx = pd.bdate_range(start=start, periods=h)
    except Exception:
        idx = pd.date_range(start=start, periods=h, freq='B')
    return idx

def safe_reconstruct_prices_from_log_returns(last_prices, fc_returns_df):
    """
    last_prices: pd.Series (last observed prices)
    fc_returns_df: DataFrame of forecasted log-returns (index = future dates)
    returns DataFrame of forecasted prices
    """
    common = [c for c in fc_returns_df.columns if c in last_prices.index]
    if len(common) == 0:
        raise ValueError("No matching tickers between last_prices and fc_returns_df")
    if len(common) < len(fc_returns_df.columns):
        missing = [c for c in fc_returns_df.columns if c not in common]
        print("⚠ Missing last_prices for tickers:", missing)
    fc = fc_returns_df[common].copy()
    prev = last_prices[common].astype(float).copy()
    prices = []
    for i in range(len(fc)):
        r = fc.iloc[i]
        next_price = prev * np.exp(r)
        prices.append(next_price.copy())
        prev = next_price
    return pd.DataFrame(prices, index=fc.index, columns=fc.columns)
def var_forecast_plot_prices(sector, csv_path, holdout_days=60, maxlags=8):
    print(f"=== SECTOR: {sector} ===")
    # ---------- LOAD ----------
    df_prices = pd.read_csv(csv_path, index_col=0, parse_dates=True)
    df_prices = df_prices.sort_index()
    df_prices = df_prices.apply(pd.to_numeric, errors="coerce").dropna()
    # ---------- RETURNS ----------
    df_ret = np.log(df_prices).diff().dropna()
    # ---------- TRAIN / TEST ----------
    train_ret = df_ret.iloc[:-holdout_days]
    test_ret = df_ret.iloc[-holdout_days:]
    train_prices = df_prices.loc[train_ret.index]
    test_prices = df_prices.loc[test_ret.index]
    last_price = train_prices.iloc[-1]
    # ---------- FIT VAR ----------
    model = VAR(train_ret)
    try:
        sel = model.select_order(maxlags)
        lag = sel.selected_orders.get('aic') or sel.selected_orders.get('bic') or 2
        lag = int(lag)
    except Exception:
        lag = 2
    print("Chosen lag:", lag)
    res = model.fit(lag)
    # ---------- FORECAST RETURNS ----------
    steps = len(test_ret)
    fc_ret = res.forecast(train_ret.values[-res.k_ar:], steps=steps)
    fc_ret_df = pd.DataFrame(fc_ret, index=test_ret.index, columns=train_ret.columns)
    # ---------- CONVERT RETURNS BACK TO PRICES ----------
    fc_prices = []
    prev = last_price.copy()
    for t in range(steps):
        new_price = prev * np.exp(fc_ret_df.iloc[t])
        fc_prices.append(new_price)
        prev = new_price
    fc_prices_df = pd.DataFrame(fc_prices, index=test_ret.index, columns=train_ret.columns)
    # ---------- PLOT ----------
    for col in train_ret.columns:
        fig, ax = plt.subplots(figsize=(15, 2))
        # history
        train_prices[col].plot(ax=ax, color="black")
        # forecast
        fc_prices_df[col].rename(col+"-VAR").plot(ax=ax, linestyle="--", color="blue")
        # junction point between history and forecast
        ax.scatter([train_prices.index[-1]], [train_prices[col].iloc[-1]], color="red", s=30)
        ax.set_xlim(left=pd.to_datetime("2005-03-01"))
        ax.set_title(f"{sector} - {col} | VAR forecast")
        ax.grid(True)
        plt.show()
    return {
        "train": train_prices,
        "test": test_prices,
        "fc_prices": fc_prices_df,
        "fc_returns": fc_ret_df,
        "var_model": res
    }
# ---------------- core: train/forecast/plot for one sector ----------------
def var_forecast_and_plot(sector_name, csv_path,
                          holdout_days=HOLDOUT_DAYS,
                          maxlags=MAX_LAGS_VAR,
                          default_lag=DEFAULT_VAR_LAG,
                          save_out=True,
                          show_plots=True,
                          xlim_left=PLOT_XLIM_LEFT):
    """
    1) Reads csv_path (prices levels)
    2) Builds log-returns, splits into train/test (last holdout_days rows -> test)
    3) Fits VAR on train returns, selects lag by AIC (if possible)
    4) Forecasts returns for len(test) steps
    5) Reconstructs forecast prices, saves CSVs and plots per-ticker
    """
    # -- load and prepare --
    df_prices = pd.read_csv(csv_path, index_col=0, parse_dates=True)
    df_prices = ensure_numeric_df(df_prices).sort_index()
    if df_prices.shape[1] < 2:
        print(f"Sector {sector_name}: need >=2 series, got {df_prices.shape[1]}. Skipping.")
        return None
    # compute returns
    df_ret = compute_log_returns(df_prices)
    if df_ret.shape[0] < 30:
        print(f"Sector {sector_name}: too few returns observations ({df_ret.shape[0]}). Skipping.")
        return None
    # split train / test by last holdout_days
    if holdout_days is None or holdout_days <= 0 or holdout_days >= df_ret.shape[0]:
        # use default small test (10) if invalid
        holdout_days = min(10, max(1, df_ret.shape[0]//10))
    train_ret = df_ret.iloc[:-holdout_days]
    test_ret = df_ret.iloc[-holdout_days:]
    # also get train prices (levels) for plotting and last_price
    train_prices = df_prices.loc[train_ret.index]
    test_prices = df_prices.loc[test_ret.index]
    last_price_for_reconstruct = train_prices.iloc[-1]
    # -- select lag --
    try:
        sel = VAR(train_ret).select_order(maxlags)
        chosen = sel.selected_orders.get('aic') or sel.selected_orders.get('bic') or default_lag
        lag = int(chosen) if (pd.notna(chosen)) else default_lag
        if lag < 1:
            lag = default_lag
    except Exception:
        lag = default_lag
    print(f"{sector_name}: chosen VAR lag = {lag}")
    # -- fit VAR --
    model = VAR(train_ret)
    try:
        var_res = model.fit(lag)
    except Exception as e:
        print(f"{sector_name}: VAR fit failed: {e}")
        return None
    # -- forecast returns for test horizon --
    steps = len(test_ret)
    fc_array = var_res.forecast(y=train_ret.values[-var_res.k_ar:], steps=steps)
    # align index with test_ret index
    fc_returns = pd.DataFrame(fc_array, index=test_ret.index, columns=train_ret.columns)
    fc_returns = fc_returns.rename(columns={c: c + "-VAR" for c in fc_returns.columns})
    # -- reconstruct forecast prices from last train price --
    # NOTE: fc_returns columns currently have suffix '-VAR'; create mapping back to original
    orig_cols = [c.replace("-VAR", "") for c in fc_returns.columns]
    fc_returns_no_suffix = fc_returns.copy()
    fc_returns_no_suffix.columns = orig_cols
    fc_prices = safe_reconstruct_prices_from_log_returns(last_price_for_reconstruct, fc_returns_no_suffix)
    # add suffix to price columns to save
    fc_prices.columns = [c + "-VAR" for c in fc_prices.columns]
    # -- save CSVs --
    if save_out:
        base = f"{sector_name}_VAR_h{steps}"
        fc_returns.to_csv(os.path.join(OUT_VAR, base + "_returns.csv"))
        fc_prices.to_csv(os.path.join(OUT_VAR, base + "_prices.csv"))
        print(f"Saved: {os.path.join(OUT_VAR, base + '_returns.csv')}")
        print(f"Saved: {os.path.join(OUT_VAR, base + '_prices.csv')}")
    # -- plotting (per-ticker) --
    # History = train_prices (full train); the forecast starts from the next date after the last train price.
    for orig_col in train_ret.columns:
        col_var = orig_col + "-VAR"
        fig, ax = plt.subplots(figsize=(15, 2))
        # combine history (orig_col) and forecast (col_var) into one DataFrame for plotting
        hist_series = pd.DataFrame(train_prices[[orig_col]])
        fc_series = pd.DataFrame(fc_prices[[col_var]])
        combined = pd.concat([hist_series, fc_series], axis=1)
        # plot
        combined.plot(ax=ax, legend=False)
        # set xlim if possible
        try:
            ax.set_xlim(left=xlim_left)
        except Exception:
            pass
        ax.set_xlabel("")
        ax.set_title(f"{sector_name} - {orig_col} - VAR forecast (h={steps})")
        # tighten visually
        plt.tight_layout()
        # save per-plot (before showing, so the saved figure is never blank)
        fname = os.path.join(OUT_VAR, f"{sector_name}_{orig_col}_VAR_h{steps}.png")
        fig.savefig(fname, bbox_inches='tight', dpi=150)
        if show_plots:
            plt.show()
        plt.close(fig)
    # Also return dict with results
    return {"var_res": var_res, "train_ret": train_ret, "test_ret": test_ret,
            "fc_returns": fc_returns, "fc_prices": fc_prices, "lag": lag}
for s, path in SECTORS.items():
    print("Running sector:", s)
    var_forecast_plot_prices(s, path, holdout_days=HOLDOUT_DAYS)
Running sector: rus_fin
=== SECTOR: rus_fin ===
Chosen lag: 4
Running sector: rus_oil
=== SECTOR: rus_oil ===
Chosen lag: 1
Running sector: rus_met
=== SECTOR: rus_met ===
Chosen lag: 2
Running sector: us_tech
=== SECTOR: us_tech ===
Chosen lag: 1
Running sector: us_retail
=== SECTOR: us_retail ===
Chosen lag: 2
Running sector: us_energy
=== SECTOR: us_energy ===
Chosen lag: 7
That attempt did not work out, so let's try again.
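For the VECM part, the code that follows sets the cointegration rank to the number of Johansen trace statistics that exceed their 95% critical values. The usual reading of the trace test is sequential: test rank = 0, then rank <= 1, and so on, stopping at the first hypothesis that is not rejected. The two rules agree unless a later statistic rejects after an earlier one did not. Below is a sketch of both rules in plain Python. The function names and all numbers are hypothetical and are not results from this notebook.

```python
def select_rank_sequential(trace_stats, crit_values):
    """Return the first r for which H0: rank <= r is not rejected."""
    for r, (stat, crit) in enumerate(zip(trace_stats, crit_values)):
        if stat <= crit:
            return r
    return len(trace_stats)  # every hypothesis rejected: full rank

def select_rank_by_count(trace_stats, crit_values):
    """Count how many statistics exceed their critical values."""
    return sum(s > c for s, c in zip(trace_stats, crit_values))

# Hypothetical values for a 3-variable system (not taken from the notebook)
toy_trace = [35.2, 12.1, 4.5]
toy_crit95 = [29.8, 15.5, 3.8]
print("sequential rank:", select_rank_sequential(toy_trace, toy_crit95))
print("count-based rank:", select_rank_by_count(toy_trace, toy_crit95))
```

In this made-up case the sequential rule stops at rank 1 while counting gives 2, so it is worth printing the trace statistics next to the critical values before trusting the selected rank.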
import numpy as np
import pandas as pd
from statsmodels.tsa.vector_ar.var_model import VAR
from statsmodels.tsa.vector_ar.vecm import VECM, coint_johansen
from statsmodels.tsa.statespace.varmax import VARMAX
from sklearn.metrics import mean_squared_error, mean_absolute_error
import matplotlib.pyplot as plt
# -------------------- HELPER UTILS --------------------
def compute_returns(df_prices):
    return np.log(df_prices).diff().dropna()

def reconstruct_prices(last_price, forecast_returns):
    """Convert forecasted log-returns into prices step by step."""
    prices = []
    prev = last_price.copy()
    for t in range(len(forecast_returns)):
        prev = prev * np.exp(forecast_returns.iloc[t])
        prices.append(prev.copy())
    return pd.DataFrame(prices, index=forecast_returns.index, columns=forecast_returns.columns)

def future_index(start_date, steps):
    """Create `steps` future business days after start_date."""
    return pd.bdate_range(start=start_date + pd.Timedelta(days=1), periods=steps)

def rmse(y_true, y_pred):
    return np.sqrt(mean_squared_error(y_true, y_pred))

def mae(y_true, y_pred):
    return mean_absolute_error(y_true, y_pred)
def monte_carlo_VAR(var_res, last_price, h, returns_train, n_sims=200):
    """
    Monte Carlo simulation around the VAR forecast:
    - takes the mean forecast
    - adds N(0, Σ_u) noise
    - reconstructs n_sims price paths
    """
    Σ = var_res.sigma_u
    k = Σ.shape[0]
    mean_fc = var_res.forecast(returns_train.values[-var_res.k_ar:], steps=h)
    mean_fc_df = pd.DataFrame(mean_fc, index=future_index(returns_train.index[-1], h), columns=returns_train.columns)
    sims = np.zeros((n_sims, h, k))
    last = last_price.values
    for s in range(n_sims):
        path = last.copy()
        for t in range(h):
            shock = np.random.multivariate_normal(np.zeros(k), Σ)
            r = mean_fc[t] + shock
            path = path * np.exp(r)
            sims[s, t, :] = path
    return mean_fc_df, sims

def summarize_mc_paths(sims, index, columns):
    """Return the mean, median, 5% and 95% quantile paths."""
    mean = sims.mean(axis=0)
    median = np.median(sims, axis=0)
    low = np.quantile(sims, 0.05, axis=0)
    high = np.quantile(sims, 0.95, axis=0)
    return (
        pd.DataFrame(mean, index=index, columns=columns),
        pd.DataFrame(median, index=index, columns=columns),
        pd.DataFrame(low, index=index, columns=columns),
        pd.DataFrame(high, index=index, columns=columns),
    )
def run_VAR(df_prices, train_ret, test_ret):
    # lag selection
    sel = VAR(train_ret).select_order(8)
    lag = sel.selected_orders.get("aic") or sel.selected_orders.get("bic") or 2
    lag = int(lag)
    var_model = VAR(train_ret).fit(lag)
    # ----------- SUMMARY OUTPUT -----------
    print(f"\n===== VAR SUMMARY ({sector_name if 'sector_name' in locals() else ''}) =====")
    try:
        print(var_model.summary().as_text())
    except Exception:
        print(var_model.summary())
    # --------------------------------------
    # forecast returns
    h = len(test_ret)
    fc_ret = var_model.forecast(train_ret.values[-var_model.k_ar:], steps=h)
    fc_ret_df = pd.DataFrame(fc_ret, index=test_ret.index, columns=train_ret.columns)
    # prices
    fc_price = reconstruct_prices(df_prices.iloc[-h-1], fc_ret_df)
    # Monte Carlo
    mean_fc_df, sims = monte_carlo_VAR(var_model, df_prices.iloc[-h-1], h, train_ret)
    mean_mc, med_mc, low_mc, high_mc = summarize_mc_paths(sims, mean_fc_df.index, mean_fc_df.columns)
    return {
        "model": var_model,
        "forecast_returns": fc_ret_df,
        "forecast_prices": fc_price,
        "mc_mean": mean_mc,
        "mc_low": low_mc,
        "mc_high": high_mc,
        "mc_sims": sims,
        # "lag": lag
    }
def run_VECM(df_prices, train_ret, test_ret):
    """
    VECM forecast with guards against dimension errors.
    Uses only the built-in predict(), without manual alpha*beta arithmetic.
    """
    # levels (not returns!)
    train_lvl = df_prices.loc[train_ret.index]
    test_lvl = df_prices.loc[test_ret.index]
    # ---------- Johansen test ----------
    try:
        joh = coint_johansen(train_lvl, det_order=0, k_ar_diff=1)
    except Exception as e:
        print("VECM: Johansen failed:", e)
        return {"model": None}
    trace = joh.lr1
    cvt = joh.cvt
    r = sum(trace > cvt[:, 1])  # rank at 95% level
    if r <= 0:
        print("VECM: No cointegration (rank = 0). Skipping.")
        return {"model": None}
    if r >= train_lvl.shape[1]:
        print(f"VECM: Invalid rank r={r}, k={train_lvl.shape[1]}. Skipping.")
        return {"model": None}
    # ---------- Fit VECM ----------
    try:
        vecm = VECM(train_lvl, k_ar_diff=1, coint_rank=r)
        res = vecm.fit()
    except Exception as e:
        print("VECM fit failed:", e)
        return {"model": None}
    # ----------- SUMMARY OUTPUT -----------
    # the header text is reused from run_VAR; this block prints the VECM summary
    print(f"\n===== VAR SUMMARY ({sector_name if 'sector_name' in locals() else ''}) =====")
    try:
        print(res.summary().as_text())
    except Exception:
        print(res.summary())
    # --------------------------------------
    # ---------- Forecast ----------
    h = len(test_ret)
    try:
        fc = res.predict(steps=h)
    except Exception as e:
        print("VECM forecast failed:", e)
        return {"model": None}
    idx = test_lvl.index
    fc_df = pd.DataFrame(fc, index=idx, columns=train_lvl.columns)
    # ---------- Monte Carlo ----------
    # simplified variant: just add small shocks to the point forecast
    resid = res.resid
    Σ = np.cov(resid.T)
    sims = np.zeros((200, h, train_lvl.shape[1]))
    for s in range(200):
        prev = train_lvl.iloc[-1].values.copy()
        for t in range(h):
            eps = np.random.multivariate_normal(np.zeros(train_lvl.shape[1]), Σ)
            prev = fc_df.iloc[t].values + eps
            sims[s, t, :] = prev
    mean_mc = sims.mean(axis=0)
    mean_mc_df = pd.DataFrame(mean_mc, index=idx, columns=train_lvl.columns)
    return {
        "model": res,
        "forecast_prices": fc_df,
        "mc_mean": mean_mc_df,
        "mc_sims": sims
    }
def run_VARMAX(df_prices, train_ret, test_ret):
    try:
        # simple model without exogenous variables
        model = VARMAX(train_ret, order=(1, 1))
        res = model.fit(disp=False)
    except Exception:
        return {"model": None}
    # ----------- SUMMARY OUTPUT -----------
    # the header text is reused from run_VAR; this block prints the VARMAX summary
    print(f"\n===== VAR SUMMARY ({sector_name if 'sector_name' in locals() else ''}) =====")
    try:
        print(res.summary().as_text())
    except Exception:
        print(res.summary())
    # --------------------------------------
    h = len(test_ret)
    fc_ret = res.forecast(steps=h)
    fc_ret.index = test_ret.index
    fc_price = reconstruct_prices(df_prices.iloc[-h-1], fc_ret)
    # MC: noise drawn from the residual covariance
    resid = res.resid.dropna()
    Σ = np.cov(resid.T)
    sims = np.zeros((200, h, train_ret.shape[1]))
    for s in range(200):
        prev = df_prices.iloc[-h-1].values.copy()
        for t in range(h):
            eps = np.random.multivariate_normal(np.zeros(train_ret.shape[1]), Σ)
            r = fc_ret.iloc[t].values + eps
            prev = prev * np.exp(r)
            sims[s, t, :] = prev
    mean_mc = sims.mean(axis=0)
    mean_mc_df = pd.DataFrame(mean_mc, index=test_ret.index, columns=train_ret.columns)
    return {
        "model": res,
        "forecast_prices": fc_price,
        "mc_mean": mean_mc_df,
        "mc_sims": sims
    }
def compare_models(sector, df_prices, holdout_days=60):
    print(f"\n=== {sector.upper()} ===")
    df_prices = df_prices.apply(pd.to_numeric, errors="coerce").dropna()
    df_ret = compute_returns(df_prices)
    train = df_ret.iloc[:-holdout_days]
    test = df_ret.iloc[-holdout_days:]
    # ---------------- VAR ----------------
    VAR_res = run_VAR(df_prices, train, test)
    # ---------------- VECM ----------------
    VECM_res = run_VECM(df_prices, train, test)
    # ---------------- VARMAX ----------------
    VARMAX_res = run_VARMAX(df_prices, train, test)
    # ---------------- ERROR METRICS ----------------
    metrics = {}
    # VAR
    if VAR_res.get("forecast_prices") is not None:
        y_true = df_prices.loc[VAR_res["forecast_prices"].index]
        y_pred = VAR_res["forecast_prices"]
        metrics["VAR_RMSE"] = rmse(y_true, y_pred)
        metrics["VAR_MAE"] = mae(y_true, y_pred)
    else:
        metrics["VAR_RMSE"] = None
        metrics["VAR_MAE"] = None
    # VECM
    if VECM_res.get("forecast_prices") is not None:
        y_true = df_prices.loc[VECM_res["forecast_prices"].index]
        y_pred = VECM_res["forecast_prices"]
        metrics["VECM_RMSE"] = rmse(y_true, y_pred)
        metrics["VECM_MAE"] = mae(y_true, y_pred)
    else:
        metrics["VECM_RMSE"] = None
        metrics["VECM_MAE"] = None
    # VARMAX
    if VARMAX_res.get("forecast_prices") is not None:
        y_true = df_prices.loc[VARMAX_res["forecast_prices"].index]
        y_pred = VARMAX_res["forecast_prices"]
        metrics["VARMAX_RMSE"] = rmse(y_true, y_pred)
        metrics["VARMAX_MAE"] = mae(y_true, y_pred)
    else:
        metrics["VARMAX_RMSE"] = None
        metrics["VARMAX_MAE"] = None
    return {
        "VAR": VAR_res,
        "VECM": VECM_res,
        "VARMAX": VARMAX_res,
        "metrics": metrics
    }
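The Monte Carlo helpers above add Gaussian shocks to the mean return forecast, compound each path into prices, and summarize the simulated paths by quantiles. Here is a one-asset sketch of the same idea using only the standard library. The function name, the seed, and all inputs are made up for illustration, and a single standard deviation replaces the covariance matrix.

```python
import math
import random
import statistics

def simulate_price_bands(last_price, mean_returns, sigma, n_sims=500, seed=0):
    """Return a list of (5% quantile, median, 95% quantile) per horizon step."""
    rng = random.Random(seed)
    h = len(mean_returns)
    paths = []
    for _ in range(n_sims):
        price, path = last_price, []
        for t in range(h):
            r = mean_returns[t] + rng.gauss(0.0, sigma)  # mean forecast + shock
            price *= math.exp(r)                         # compound the log-return
            path.append(price)
        paths.append(path)
    bands = []
    for t in range(h):
        column = [p[t] for p in paths]
        cuts = statistics.quantiles(column, n=20)  # 19 cut points: 5%, 10%, ..., 95%
        bands.append((cuts[0], statistics.median(column), cuts[-1]))
    return bands

toy_bands = simulate_price_bands(100.0, [0.001] * 10, sigma=0.02)
for step, (low, med, high) in enumerate(toy_bands, start=1):
    print(step, round(low, 2), round(med, 2), round(high, 2))
```

As in the helpers above, the shocks are added to a fixed mean path and do not feed back through the lag dynamics, so the bands mostly reflect accumulated noise and widen with the horizon.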
We run the forecasts and plot the results.
results = {}
for sector, file in SECTORS.items():
    df = pd.read_csv(file, index_col=0, parse_dates=True)
    results[sector] = compare_models(sector, df, holdout_days=60)
=== RUS_FIN ===
===== VAR SUMMARY () =====
Summary of Regression Results
==================================
Model: VAR
Method: OLS
Date: Sun, 23, Nov, 2025
Time: 21:09:15
--------------------------------------------------------------------
No. of Equations: 3.00000 BIC: -23.0519
Nobs: 364.000 HQIC: -23.3035
Log likelihood: 2760.97 FPE: 6.41754e-11
AIC: -23.4695 Det(Omega_mle): 5.77628e-11
--------------------------------------------------------------------
Results for equation SBER.ME
=============================================================================
coefficient std. error t-stat prob
-----------------------------------------------------------------------------
const -0.000953 0.001161 -0.821 0.412
L1.SBER.ME -0.120067 0.086005 -1.396 0.163
L1.VTBR.ME 0.118705 0.091821 1.293 0.196
L1.TCSG.ME -0.010379 0.042614 -0.244 0.808
L2.SBER.ME -0.053712 0.085561 -0.628 0.530
L2.VTBR.ME -0.110885 0.090613 -1.224 0.221
L2.TCSG.ME 0.051327 0.042311 1.213 0.225
L3.SBER.ME -0.125822 0.085604 -1.470 0.142
L3.VTBR.ME -0.034478 0.090393 -0.381 0.703
L3.TCSG.ME 0.167684 0.042923 3.907 0.000
L4.SBER.ME 0.227856 0.084188 2.707 0.007
L4.VTBR.ME -0.166232 0.089140 -1.865 0.062
L4.TCSG.ME -0.010293 0.044614 -0.231 0.818
=============================================================================
Results for equation VTBR.ME
=============================================================================
coefficient std. error t-stat prob
-----------------------------------------------------------------------------
const -0.001037 0.001032 -1.005 0.315
L1.SBER.ME -0.054795 0.076427 -0.717 0.473
L1.VTBR.ME 0.195459 0.081596 2.395 0.017
L1.TCSG.ME 0.008296 0.037868 0.219 0.827
L2.SBER.ME 0.014093 0.076033 0.185 0.853
L2.VTBR.ME -0.053074 0.080522 -0.659 0.510
L2.TCSG.ME -0.013821 0.037599 -0.368 0.713
L3.SBER.ME 0.009426 0.076071 0.124 0.901
L3.VTBR.ME -0.078054 0.080327 -0.972 0.331
L3.TCSG.ME 0.187902 0.038143 4.926 0.000
L4.SBER.ME 0.105043 0.074813 1.404 0.160
L4.VTBR.ME -0.076017 0.079214 -0.960 0.337
L4.TCSG.ME 0.046019 0.039646 1.161 0.246
=============================================================================
Results for equation TCSG.ME
=============================================================================
coefficient std. error t-stat prob
-----------------------------------------------------------------------------
const 0.001068 0.001729 0.618 0.537
L1.SBER.ME 0.014842 0.128082 0.116 0.908
L1.VTBR.ME 0.143472 0.136744 1.049 0.294
L1.TCSG.ME -0.036270 0.063463 -0.572 0.568
L2.SBER.ME -0.063421 0.127421 -0.498 0.619
L2.VTBR.ME -0.101289 0.134945 -0.751 0.453
L2.TCSG.ME -0.019129 0.063011 -0.304 0.761
L3.SBER.ME -0.233042 0.127485 -1.828 0.068
L3.VTBR.ME 0.114846 0.134617 0.853 0.394
L3.TCSG.ME 0.254861 0.063923 3.987 0.000
L4.SBER.ME 0.288242 0.125377 2.299 0.022
L4.VTBR.ME -0.074529 0.132752 -0.561 0.575
L4.TCSG.ME -0.075576 0.066441 -1.137 0.255
=============================================================================
Correlation matrix of residuals
SBER.ME VTBR.ME TCSG.ME
SBER.ME 1.000000 0.758710 0.539431
VTBR.ME 0.758710 1.000000 0.471629
TCSG.ME 0.539431 0.471629 1.000000
===== VAR SUMMARY () =====
Det. terms outside the coint. relation & lagged endog. parameters for equation SBER.ME
===================================================================================
coef std err z P>|z| [0.025 0.975]
-----------------------------------------------------------------------------------
L1.SBER.ME.SBER -0.0022 0.092 -0.024 0.981 -0.183 0.179
L1.VTBR.ME.SBER -0.0172 0.023 -0.752 0.452 -0.062 0.028
L1.TCSG.ME.SBER -0.0026 0.004 -0.705 0.481 -0.010 0.005
Det. terms outside the coint. relation & lagged endog. parameters for equation VTBR.ME
===================================================================================
coef std err z P>|z| [0.025 0.975]
-----------------------------------------------------------------------------------
L1.SBER.ME.VTBR -0.0359 0.368 -0.098 0.922 -0.758 0.686
L1.VTBR.ME.VTBR 0.1341 0.091 1.473 0.141 -0.044 0.312
L1.TCSG.ME.VTBR 0.0158 0.015 1.071 0.284 -0.013 0.045
Det. terms outside the coint. relation & lagged endog. parameters for equation TCSG.ME
===================================================================================
coef std err z P>|z| [0.025 0.975]
-----------------------------------------------------------------------------------
L1.SBER.ME.TCSG -0.9703 3.163 -0.307 0.759 -7.170 5.230
L1.VTBR.ME.TCSG -0.0012 0.781 -0.002 0.999 -1.533 1.530
L1.TCSG.ME.TCSG 0.0485 0.126 0.384 0.701 -0.199 0.296
Loading coefficients (alpha) for equation SBER.ME
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
ec1.SBER 0.0009 0.035 0.027 0.978 -0.067 0.069
Loading coefficients (alpha) for equation VTBR.ME
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
ec1.VTBR 0.1531 0.138 1.107 0.268 -0.118 0.424
Loading coefficients (alpha) for equation TCSG.ME
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
ec1.TCSG 1.7600 1.187 1.483 0.138 -0.566 4.086
Cointegration relations for loading-coefficients-column 1
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
beta.1 1.0000 0 0 0.000 1.000 1.000
beta.2 -0.1981 0.004 -44.854 0.000 -0.207 -0.189
beta.3 -0.0480 0.001 -91.424 0.000 -0.049 -0.047
==============================================================================
===== VAR SUMMARY () =====
Statespace Model Results
=============================================================================================
Dep. Variable: ['SBER.ME', 'VTBR.ME', 'TCSG.ME'] No. Observations: 368
Model: VARMA(1,1) Log Likelihood 2755.593
+ intercept AIC -5457.186
Date: Sun, 23 Nov 2025 BIC -5351.668
Time: 21:11:00 HQIC -5415.265
Sample: 0
- 368
Covariance Type: opg
========================================================================================
Ljung-Box (L1) (Q): 0.02, 0.00, 0.01 Jarque-Bera (JB): 387.28, 448.60, 3277.85
Prob(Q): 0.89, 0.99, 0.92 Prob(JB): 0.00, 0.00, 0.00
Heteroskedasticity (H): 0.81, 0.77, 0.59 Skew: -0.21, -0.35, -1.73
Prob(H) (two-sided): 0.25, 0.15, 0.00 Kurtosis: 8.01, 8.36, 17.20
Results for equation SBER.ME
======================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------
intercept.SBER -0.0006 0.002 -0.326 0.744 -0.004 0.003
L1.SBER.ME.SBER -0.0274 0.644 -0.042 0.966 -1.290 1.235
L1.VTBR.ME.SBER 0.0185 0.465 0.040 0.968 -0.893 0.930
L1.TCSG.ME.SBER -0.0396 0.606 -0.065 0.948 -1.227 1.148
L1.e(SBER.ME).SBER -0.0382 0.651 -0.059 0.953 -1.314 1.238
L1.e(VTBR.ME).SBER 0.0542 0.466 0.116 0.907 -0.858 0.967
L1.e(TCSG.ME).SBER 0.0085 0.608 0.014 0.989 -1.183 1.200
Results for equation VTBR.ME
======================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------
intercept.VTBR -0.0007 0.002 -0.405 0.686 -0.004 0.003
L1.SBER.ME.VTBR 0.0072 0.680 0.011 0.992 -1.326 1.340
L1.VTBR.ME.VTBR 0.1495 0.498 0.300 0.764 -0.827 1.126
L1.TCSG.ME.VTBR -0.0084 0.616 -0.014 0.989 -1.216 1.199
L1.e(SBER.ME).VTBR -0.0212 0.690 -0.031 0.976 -1.374 1.332
L1.e(VTBR.ME).VTBR 0.0233 0.519 0.045 0.964 -0.994 1.041
L1.e(TCSG.ME).VTBR 0.0013 0.615 0.002 0.998 -1.205 1.208
Results for equation TCSG.ME
======================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------
intercept.TCSG 0.0011 0.003 0.370 0.711 -0.005 0.007
L1.SBER.ME.TCSG 0.1434 1.116 0.129 0.898 -2.044 2.331
L1.VTBR.ME.TCSG -0.0016 0.790 -0.002 0.998 -1.550 1.546
L1.TCSG.ME.TCSG -0.0755 0.870 -0.087 0.931 -1.780 1.629
L1.e(SBER.ME).TCSG -0.0615 1.109 -0.055 0.956 -2.235 2.112
L1.e(VTBR.ME).TCSG 0.0699 0.782 0.089 0.929 -1.462 1.602
L1.e(TCSG.ME).TCSG 0.0056 0.870 0.006 0.995 -1.699 1.710
Error covariance matrix
============================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------------
sqrt.var.SBER.ME 0.0223 0.001 31.555 0.000 0.021 0.024
sqrt.cov.SBER.ME.VTBR.ME 0.0152 0.001 19.924 0.000 0.014 0.017
sqrt.var.VTBR.ME 0.0130 0.000 31.441 0.000 0.012 0.014
sqrt.cov.SBER.ME.TCSG.ME 0.0188 0.002 10.547 0.000 0.015 0.022
sqrt.cov.VTBR.ME.TCSG.ME 0.0037 0.002 2.004 0.045 8.17e-05 0.007
sqrt.var.TCSG.ME 0.0273 0.001 42.064 0.000 0.026 0.029
============================================================================================
Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
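The `z`, `P>|z|` and confidence-interval columns in the coefficient tables can be recomputed from `coef` and `std err` alone. The sketch below is a minimal check, assuming a two-sided test against the standard normal distribution; the helper name `wald_z` is introduced here for illustration and is not part of the notebook. The inputs are the rounded values of the row `L1.VTBR.ME.VTBR` printed above, so agreement with the table holds only up to rounding.

```python
import math

def wald_z(coef, std_err, z_crit=1.959964):
    """Return the z statistic, two-sided normal p-value and 95% interval."""
    z = coef / std_err
    # Two-sided p-value under N(0, 1): P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    ci = (coef - z_crit * std_err, coef + z_crit * std_err)
    return z, p_value, ci

# Rounded values taken from the row L1.VTBR.ME.VTBR of the VARMA(1,1) table.
z, p_value, ci = wald_z(0.1495, 0.498)
print(f"z={z:.3f}, p={p_value:.3f}, ci=({ci[0]:.3f}, {ci[1]:.3f})")
```

With standard errors this large relative to the coefficients, every AR and MA term in the table has an interval that contains zero.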
=== RUS_OIL ===
===== VAR SUMMARY () =====
Summary of Regression Results
==================================
Model: VAR
Method: OLS
Date: Sun, 23, Nov, 2025
Time: 21:11:03
--------------------------------------------------------------------
No. of Equations: 3.00000 BIC: -25.5055
Nobs: 1554.00 HQIC: -25.5315
Log likelihood: 13246.8 FPE: 8.03825e-12
AIC: -25.5468 Det(Omega_mle): 7.97649e-12
--------------------------------------------------------------------
Results for equation GAZP.ME
=============================================================================
coefficient std. error t-stat prob
-----------------------------------------------------------------------------
const 0.000693 0.000395 1.755 0.079
L1.GAZP.ME 0.087672 0.031871 2.751 0.006
L1.LKOH.ME -0.024068 0.031368 -0.767 0.443
L1.ROSN.ME -0.007666 0.032124 -0.239 0.811
=============================================================================
Results for equation LKOH.ME
=============================================================================
coefficient std. error t-stat prob
-----------------------------------------------------------------------------
const 0.000908 0.000452 2.008 0.045
L1.GAZP.ME -0.004056 0.036521 -0.111 0.912
L1.LKOH.ME -0.036476 0.035944 -1.015 0.310
L1.ROSN.ME 0.062906 0.036811 1.709 0.087
=============================================================================
Results for equation ROSN.ME
=============================================================================
coefficient std. error t-stat prob
-----------------------------------------------------------------------------
const 0.000559 0.000447 1.251 0.211
L1.GAZP.ME 0.007334 0.036067 0.203 0.839
L1.LKOH.ME 0.000547 0.035497 0.015 0.988
L1.ROSN.ME 0.107910 0.036353 2.968 0.003
=============================================================================
Correlation matrix of residuals
GAZP.ME LKOH.ME ROSN.ME
GAZP.ME 1.000000 0.544161 0.564897
LKOH.ME 0.544161 1.000000 0.684490
ROSN.ME 0.564897 0.684490 1.000000
VECM: No cointegration (rank = 0). Skipping.
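The message above reports a cointegration rank of zero. One common way to arrive at such a rank is the sequential Johansen trace test; the sketch below shows that decision rule only. It is an assumption that the notebook uses this rule, and the numbers are hypothetical, not taken from this run. In statsmodels the inputs would typically come from the `lr1` and `cvt` attributes of the result of `coint_johansen`.

```python
def select_coint_rank(trace_stats, critical_values):
    """Sequential trace test: return the first r for which H0 (rank <= r) is not rejected."""
    for r, (stat, crit) in enumerate(zip(trace_stats, critical_values)):
        if stat < crit:
            return r
    # Every hypothesis was rejected: the system has full rank.
    return len(trace_stats)

# Hypothetical trace statistics and 95% critical values for a 3-variable system.
hypothetical_trace = [21.4, 8.9, 1.2]
hypothetical_crit = [29.8, 15.5, 3.8]
print(select_coint_rank(hypothetical_trace, hypothetical_crit))
```

A rank of zero means no stationary linear combination of the level series was found, so a VECM has no error-correction term to estimate and the model is skipped.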
===== VARMAX SUMMARY () =====
Statespace Model Results
=============================================================================================
Dep. Variable: ['GAZP.ME', 'LKOH.ME', 'ROSN.ME'] No. Observations: 1555
Model: VARMA(1,1) Log Likelihood 13251.945
+ intercept AIC -26449.890
Date: Sun, 23 Nov 2025 BIC -26305.461
Time: 21:12:53 HQIC -26396.182
Sample: 0
- 1555
Covariance Type: opg
==========================================================================================
Ljung-Box (L1) (Q): 0.01, 0.00, 0.00 Jarque-Bera (JB): 5033.88, 12306.17, 595.48
Prob(Q): 0.91, 1.00, 0.98 Prob(JB): 0.00, 0.00, 0.00
Heteroskedasticity (H): 1.76, 2.02, 0.98 Skew: 0.77, -0.85, 0.04
Prob(H) (two-sided): 0.00, 0.00, 0.78 Kurtosis: 11.68, 16.68, 6.03
Results for equation GAZP.ME
======================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------
intercept.GAZP 0.0007 0.001 0.927 0.354 -0.001 0.002
L1.GAZP.ME.GAZP 0.0877 0.351 0.250 0.802 -0.599 0.775
L1.LKOH.ME.GAZP -0.0241 0.757 -0.032 0.975 -1.509 1.461
L1.ROSN.ME.GAZP -0.0077 0.371 -0.021 0.984 -0.735 0.720
L1.e(GAZP.ME).GAZP -0.0001 0.354 -0.000 1.000 -0.693 0.693
L1.e(LKOH.ME).GAZP 0.0035 0.755 0.005 0.996 -1.476 1.483
L1.e(ROSN.ME).GAZP 0.0011 0.374 0.003 0.998 -0.732 0.734
Results for equation LKOH.ME
======================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------
intercept.LKOH 0.0009 0.001 1.074 0.283 -0.001 0.003
L1.GAZP.ME.LKOH -0.0038 0.388 -0.010 0.992 -0.765 0.758
L1.LKOH.ME.LKOH -0.0364 0.813 -0.045 0.964 -1.630 1.558
L1.ROSN.ME.LKOH 0.0629 0.328 0.192 0.848 -0.579 0.705
L1.e(GAZP.ME).LKOH -0.0032 0.392 -0.008 0.993 -0.772 0.765
L1.e(LKOH.ME).LKOH -0.0017 0.818 -0.002 0.998 -1.606 1.602
L1.e(ROSN.ME).LKOH 0.0083 0.336 0.025 0.980 -0.650 0.667
Results for equation ROSN.ME
======================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------
intercept.ROSN 0.0006 0.001 0.665 0.506 -0.001 0.002
L1.GAZP.ME.ROSN 0.0073 0.433 0.017 0.986 -0.841 0.856
L1.LKOH.ME.ROSN 0.0008 0.717 0.001 0.999 -1.404 1.406
L1.ROSN.ME.ROSN 0.1078 0.273 0.395 0.693 -0.427 0.642
L1.e(GAZP.ME).ROSN 0.0018 0.433 0.004 0.997 -0.847 0.851
L1.e(LKOH.ME).ROSN 0.0028 0.728 0.004 0.997 -1.424 1.429
L1.e(ROSN.ME).ROSN -0.0001 0.286 -0.001 1.000 -0.560 0.560
Error covariance matrix
============================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------------
sqrt.var.GAZP.ME 0.0155 0.000 72.647 0.000 0.015 0.016
sqrt.cov.GAZP.ME.LKOH.ME 0.0097 0.000 32.821 0.000 0.009 0.010
sqrt.var.LKOH.ME 0.0149 0.000 92.313 0.000 0.015 0.015
sqrt.cov.GAZP.ME.ROSN.ME 0.0099 0.000 29.383 0.000 0.009 0.011
sqrt.cov.LKOH.ME.ROSN.ME 0.0079 0.000 33.171 0.000 0.007 0.008
sqrt.var.ROSN.ME 0.0122 0.000 72.829 0.000 0.012 0.012
============================================================================================
Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
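The AIC, BIC and HQIC values in the VARMA(1,1) header can be reproduced from the log-likelihood. The sketch below assumes the usual likelihood-based definitions and a parameter count of 27 (3 intercepts, 9 AR coefficients, 9 MA coefficients and 6 covariance terms, which is the number of rows in the tables above). The function name is illustrative.

```python
import math

def information_criteria(llf, n_params, nobs):
    """Likelihood-based AIC, BIC and HQIC."""
    aic = -2.0 * llf + 2.0 * n_params
    bic = -2.0 * llf + n_params * math.log(nobs)
    hqic = -2.0 * llf + 2.0 * n_params * math.log(math.log(nobs))
    return aic, bic, hqic

# Values from the RUS_OIL VARMA(1,1) header: log-likelihood and sample size.
aic, bic, hqic = information_criteria(llf=13251.945, n_params=27, nobs=1555)
print(f"AIC={aic:.3f}, BIC={bic:.3f}, HQIC={hqic:.3f}")
```

These criteria are on the scale of the total log-likelihood, while the VAR summary for the same sector prints criteria on a per-observation scale (for example AIC: -25.5468), so the two sets of numbers should not be compared directly.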
=== RUS_MET ===
===== VAR SUMMARY () =====
Summary of Regression Results
==================================
Model: VAR
Method: OLS
Date: Sun, 23, Nov, 2025
Time: 21:12:57
--------------------------------------------------------------------
No. of Equations: 3.00000 BIC: -25.1237
Nobs: 1553.00 HQIC: -25.1691
Log likelihood: 12974.9 FPE: 1.14161e-11
AIC: -25.1960 Det(Omega_mle): 1.12631e-11
--------------------------------------------------------------------
Results for equation NLMK.ME
=============================================================================
coefficient std. error t-stat prob
-----------------------------------------------------------------------------
const 0.001232 0.000441 2.797 0.005
L1.NLMK.ME -0.092896 0.032797 -2.832 0.005
L1.GMKN.ME 0.034962 0.027295 1.281 0.200
L1.CHMF.ME 0.084534 0.037290 2.267 0.023
L2.NLMK.ME -0.081030 0.032754 -2.474 0.013
L2.GMKN.ME -0.049538 0.027304 -1.814 0.070
L2.CHMF.ME 0.088404 0.037339 2.368 0.018
=============================================================================
Results for equation GMKN.ME
=============================================================================
coefficient std. error t-stat prob
-----------------------------------------------------------------------------
const 0.000524 0.000445 1.180 0.238
L1.NLMK.ME 0.012946 0.033096 0.391 0.696
L1.GMKN.ME 0.042117 0.027544 1.529 0.126
L1.CHMF.ME 0.011866 0.037629 0.315 0.753
L2.NLMK.ME 0.009817 0.033053 0.297 0.766
L2.GMKN.ME -0.039782 0.027553 -1.444 0.149
L2.CHMF.ME -0.009004 0.037679 -0.239 0.811
=============================================================================
Results for equation CHMF.ME
=============================================================================
coefficient std. error t-stat prob
-----------------------------------------------------------------------------
const 0.001028 0.000396 2.596 0.009
L1.NLMK.ME 0.009470 0.029492 0.321 0.748
L1.GMKN.ME 0.036346 0.024544 1.481 0.139
L1.CHMF.ME -0.063767 0.033532 -1.902 0.057
L2.NLMK.ME -0.000296 0.029454 -0.010 0.992
L2.GMKN.ME -0.023810 0.024552 -0.970 0.332
L2.CHMF.ME -0.016002 0.033576 -0.477 0.634
=============================================================================
Correlation matrix of residuals
NLMK.ME GMKN.ME CHMF.ME
NLMK.ME 1.000000 0.318221 0.629246
GMKN.ME 0.318221 1.000000 0.368201
CHMF.ME 0.629246 0.368201 1.000000
VECM: No cointegration (rank = 0). Skipping.
===== VARMAX SUMMARY () =====
Statespace Model Results
=============================================================================================
Dep. Variable: ['NLMK.ME', 'GMKN.ME', 'CHMF.ME'] No. Observations: 1555
Model: VARMA(1,1) Log Likelihood 12981.996
+ intercept AIC -25909.992
Date: Sun, 23 Nov 2025 BIC -25765.563
Time: 21:13:59 HQIC -25856.284
Sample: 0
- 1555
Covariance Type: opg
========================================================================================
Ljung-Box (L1) (Q): 0.00, 0.00, 0.00 Jarque-Bera (JB): 125.80, 4928.64, 226.91
Prob(Q): 0.98, 1.00, 0.99 Prob(JB): 0.00, 0.00, 0.00
Heteroskedasticity (H): 0.95, 1.52, 0.47 Skew: 0.07, -0.43, -0.03
Prob(H) (two-sided): 0.56, 0.00, 0.00 Kurtosis: 4.39, 11.68, 4.87
Results for equation NLMK.ME
======================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------
intercept.NLMK 0.0012 0.001 1.538 0.124 -0.000 0.003
L1.NLMK.ME.NLMK -0.0854 0.310 -0.275 0.783 -0.693 0.522
L1.GMKN.ME.NLMK 0.0339 0.485 0.070 0.944 -0.916 0.984
L1.CHMF.ME.NLMK 0.0770 0.492 0.157 0.876 -0.887 1.041
L1.e(NLMK.ME).NLMK -0.0073 0.314 -0.023 0.981 -0.622 0.607
L1.e(GMKN.ME).NLMK 0.0010 0.486 0.002 0.998 -0.951 0.953
L1.e(CHMF.ME).NLMK 0.0076 0.494 0.015 0.988 -0.961 0.976
Results for equation GMKN.ME
======================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------
intercept.GMKN 0.0005 0.001 0.645 0.519 -0.001 0.002
L1.NLMK.ME.GMKN 0.0103 0.311 0.033 0.973 -0.598 0.619
L1.GMKN.ME.GMKN 0.0410 0.472 0.087 0.931 -0.883 0.965
L1.CHMF.ME.GMKN 0.0136 0.484 0.028 0.978 -0.936 0.963
L1.e(NLMK.ME).GMKN 0.0027 0.311 0.009 0.993 -0.608 0.613
L1.e(GMKN.ME).GMKN 0.0011 0.470 0.002 0.998 -0.921 0.923
L1.e(CHMF.ME).GMKN -0.0017 0.490 -0.004 0.997 -0.962 0.958
Results for equation CHMF.ME
======================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------
intercept.CHMF 0.0010 0.001 1.395 0.163 -0.000 0.002
L1.NLMK.ME.CHMF 0.0066 0.280 0.024 0.981 -0.542 0.555
L1.GMKN.ME.CHMF 0.0359 0.449 0.080 0.936 -0.845 0.916
L1.CHMF.ME.CHMF -0.0610 0.450 -0.136 0.892 -0.942 0.820
L1.e(NLMK.ME).CHMF 0.0029 0.280 0.011 0.992 -0.545 0.551
L1.e(GMKN.ME).CHMF 0.0004 0.449 0.001 0.999 -0.880 0.881
L1.e(CHMF.ME).CHMF -0.0027 0.453 -0.006 0.995 -0.891 0.885
Error covariance matrix
============================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------------
sqrt.var.NLMK.ME 0.0173 0.000 66.616 0.000 0.017 0.018
sqrt.cov.NLMK.ME.GMKN.ME 0.0055 0.000 17.215 0.000 0.005 0.006
sqrt.var.GMKN.ME 0.0165 0.000 102.208 0.000 0.016 0.017
sqrt.cov.NLMK.ME.CHMF.ME 0.0098 0.000 30.911 0.000 0.009 0.010
sqrt.cov.GMKN.ME.CHMF.ME 0.0028 0.000 9.734 0.000 0.002 0.003
sqrt.var.CHMF.ME 0.0117 0.000 74.064 0.000 0.011 0.012
============================================================================================
Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
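The `sqrt.var.*` and `sqrt.cov.*` rows are not variances and covariances themselves. The sketch below assumes they are the entries of a lower-triangular Cholesky factor of the error covariance matrix, listed row by row, and rebuilds the implied correlation matrix from the rounded RUS_MET values. Under that assumption the result is close to the residual correlation matrix printed in the VAR summary for the same sector (0.318, 0.629, 0.368), although the two models differ in lag order.

```python
import math

def corr_from_cholesky(lower):
    """Build cov = L @ L.T from a lower-triangular factor and scale it to correlations."""
    n = len(lower)
    cov = [[sum(lower[i][k] * lower[j][k] for k in range(n)) for j in range(n)]
           for i in range(n)]
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]

# Rounded entries from the RUS_MET "Error covariance matrix" table
# (order of series: NLMK.ME, GMKN.ME, CHMF.ME).
chol = [
    [0.0173, 0.0,    0.0],
    [0.0055, 0.0165, 0.0],
    [0.0098, 0.0028, 0.0117],
]
for row in corr_from_cholesky(chol):
    print([round(value, 3) for value in row])
```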
=== US_TECH ===
===== VAR SUMMARY () =====
Summary of Regression Results
==================================
Model: VAR
Method: OLS
Date: Sun, 23, Nov, 2025
Time: 21:14:03
--------------------------------------------------------------------
No. of Equations: 3.00000 BIC: -24.2448
Nobs: 2452.00 HQIC: -24.2629
Log likelihood: 19333.3 FPE: 2.87258e-11
AIC: -24.2732 Det(Omega_mle): 2.85856e-11
--------------------------------------------------------------------
Results for equation AAPL
==========================================================================
coefficient std. error t-stat prob
--------------------------------------------------------------------------
const 0.000952 0.000371 2.564 0.010
L1.AAPL -0.001404 0.028333 -0.050 0.960
L1.MSFT -0.118671 0.032959 -3.601 0.000
L1.NVDA 0.017941 0.015503 1.157 0.247
==========================================================================
Results for equation MSFT
==========================================================================
coefficient std. error t-stat prob
--------------------------------------------------------------------------
const 0.001089 0.000339 3.210 0.001
L1.AAPL -0.063696 0.025868 -2.462 0.014
L1.MSFT -0.147771 0.030093 -4.911 0.000
L1.NVDA 0.033094 0.014154 2.338 0.019
==========================================================================
Results for equation NVDA
==========================================================================
coefficient std. error t-stat prob
--------------------------------------------------------------------------
const 0.002448 0.000631 3.877 0.000
L1.AAPL -0.051237 0.048164 -1.064 0.287
L1.MSFT -0.172090 0.056030 -3.071 0.002
L1.NVDA -0.001919 0.026354 -0.073 0.942
==========================================================================
Correlation matrix of residuals
AAPL MSFT NVDA
AAPL 1.000000 0.685036 0.545302
MSFT 0.685036 1.000000 0.621345
NVDA 0.545302 0.621345 1.000000
===== VECM SUMMARY () =====
Det. terms outside the coint. relation & lagged endog. parameters for equation AAPL
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
L1.AAPL 0.0627 0.026 2.445 0.014 0.012 0.113
L1.MSFT -0.0439 0.016 -2.786 0.005 -0.075 -0.013
L1.NVDA -0.0453 0.031 -1.443 0.149 -0.107 0.016
Det. terms outside the coint. relation & lagged endog. parameters for equation MSFT
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
L1.AAPL -0.0934 0.044 -2.114 0.035 -0.180 -0.007
L1.MSFT -0.0771 0.027 -2.837 0.005 -0.130 -0.024
L1.NVDA 0.1349 0.054 2.494 0.013 0.029 0.241
Det. terms outside the coint. relation & lagged endog. parameters for equation NVDA
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
L1.AAPL 0.0211 0.018 1.146 0.252 -0.015 0.057
L1.MSFT -0.0182 0.011 -1.607 0.108 -0.040 0.004
L1.NVDA -0.0972 0.023 -4.308 0.000 -0.141 -0.053
Loading coefficients (alpha) for equation AAPL
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
ec1 -0.0133 0.003 -4.964 0.000 -0.018 -0.008
Loading coefficients (alpha) for equation MSFT
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
ec1 -0.0143 0.005 -3.099 0.002 -0.023 -0.005
Loading coefficients (alpha) for equation NVDA
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
ec1 -0.0107 0.002 -5.586 0.000 -0.014 -0.007
Cointegration relations for loading-coefficients-column 1
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
beta.1 1.0000 0 0 0.000 1.000 1.000
beta.2 -0.5593 0.020 -27.539 0.000 -0.599 -0.519
beta.3 0.0659 0.096 0.689 0.491 -0.122 0.253
==============================================================================
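The loading coefficients (alpha) and the cointegration vector (beta) combine into the error-correction part of the VECM: the deviation `ec = beta' * y[t-1]` is multiplied by each series' alpha to give its adjustment in the next step. The sketch below illustrates this with the rounded alpha and beta printed above. The level values are hypothetical, since the scale of the input series is not shown in this output, and the short-run lag terms are deliberately left out.

```python
def error_correction_step(alpha, beta, y_prev):
    """Return the equilibrium deviation and the per-series adjustment alpha * ec."""
    ec = sum(b * y for b, y in zip(beta, y_prev))
    return ec, [a * ec for a in alpha]

# Rounded estimates from the US_TECH VECM tables (order: AAPL, MSFT, NVDA).
alpha = [-0.0133, -0.0143, -0.0107]
beta = [1.0, -0.5593, 0.0659]

# Hypothetical previous-period levels, used only to show the arithmetic.
hypothetical_levels = [5.0, 6.0, 4.0]

ec, adjustment = error_correction_step(alpha, beta, hypothetical_levels)
print(round(ec, 4), [round(a, 5) for a in adjustment])
```

All three alphas are negative, so a positive deviation lowers the next change in every series. The `beta.3` coefficient has a p-value of 0.491 in the table, so the contribution of the third series to the relation is not statistically distinguishable from zero.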
===== VARMAX SUMMARY () =====
Statespace Model Results
====================================================================================
Dep. Variable: ['AAPL', 'MSFT', 'NVDA'] No. Observations: 2453
Model: VARMA(1,1) Log Likelihood 19342.473
+ intercept AIC -38630.946
Date: Sun, 23 Nov 2025 BIC -38474.209
Time: 21:16:17 HQIC -38573.988
Sample: 0
- 2453
Covariance Type: opg
===========================================================================================
Ljung-Box (L1) (Q): 0.00, 0.00, 0.00 Jarque-Bera (JB): 3850.32, 1687.74, 18448.11
Prob(Q): 0.99, 1.00, 1.00 Prob(JB): 0.00, 0.00, 0.00
Heteroskedasticity (H): 1.34, 1.34, 1.10 Skew: -0.10, -0.14, 0.51
Prob(H) (two-sided): 0.00, 0.00, 0.16 Kurtosis: 9.13, 7.05, 16.40
Results for equation AAPL
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
intercept 0.0009 0.001 1.695 0.090 -0.000 0.002
L1.AAPL -0.0014 0.678 -0.002 0.998 -1.331 1.328
L1.MSFT -0.1187 0.529 -0.224 0.822 -1.156 0.918
L1.NVDA 0.0179 0.339 0.053 0.958 -0.646 0.682
L1.e(AAPL) -0.0005 0.678 -0.001 0.999 -1.330 1.329
L1.e(MSFT) -0.0052 0.531 -0.010 0.992 -1.046 1.036
L1.e(NVDA) 0.0016 0.337 0.005 0.996 -0.659 0.662
Results for equation MSFT
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
intercept 0.0011 0.001 2.095 0.036 7e-05 0.002
L1.AAPL -0.0637 0.684 -0.093 0.926 -1.405 1.278
L1.MSFT -0.1478 0.532 -0.278 0.781 -1.191 0.895
L1.NVDA 0.0331 0.339 0.098 0.922 -0.631 0.697
L1.e(AAPL) 0.0012 0.684 0.002 0.999 -1.340 1.342
L1.e(MSFT) -0.0068 0.534 -0.013 0.990 -1.053 1.039
L1.e(NVDA) 0.0006 0.338 0.002 0.999 -0.662 0.663
Results for equation NVDA
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
intercept 0.0024 0.001 2.384 0.017 0.000 0.004
L1.AAPL -0.0512 1.404 -0.036 0.971 -2.803 2.700
L1.MSFT -0.1722 0.974 -0.177 0.860 -2.081 1.736
L1.NVDA -0.0019 0.597 -0.003 0.997 -1.173 1.169
L1.e(AAPL) -0.0002 1.406 -0.000 1.000 -2.755 2.755
L1.e(MSFT) -0.0050 0.977 -0.005 0.996 -1.920 1.910
L1.e(NVDA) 0.0023 0.597 0.004 0.997 -1.169 1.173
Error covariance matrix
======================================================================================
coef std err z P>|z| [0.025 0.975]
--------------------------------------------------------------------------------------
sqrt.var.AAPL 0.0183 0.000 120.100 0.000 0.018 0.019
sqrt.cov.AAPL.MSFT 0.0115 0.000 62.429 0.000 0.011 0.012
sqrt.var.MSFT 0.0122 0.000 105.752 0.000 0.012 0.012
sqrt.cov.AAPL.NVDA 0.0170 0.001 33.795 0.000 0.016 0.018
sqrt.cov.MSFT.NVDA 0.0106 0.000 23.702 0.000 0.010 0.011
sqrt.var.NVDA 0.0239 0.000 174.496 0.000 0.024 0.024
======================================================================================
Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
=== US_RETAIL ===
===== VAR SUMMARY () =====
Summary of Regression Results
==================================
Model: VAR
Method: OLS
Date: Sun, 23, Nov, 2025
Time: 21:16:18
--------------------------------------------------------------------
No. of Equations: 3.00000 BIC: -25.4581
Nobs: 2451.00 HQIC: -25.4898
Log likelihood: 20847.4 FPE: 8.35753e-12
AIC: -25.5079 Det(Omega_mle): 8.28633e-12
--------------------------------------------------------------------
Results for equation WMT
==========================================================================
coefficient std. error t-stat prob
--------------------------------------------------------------------------
const 0.000751 0.000274 2.735 0.006
L1.WMT -0.040319 0.025120 -1.605 0.108
L1.COST 0.000445 0.024734 0.018 0.986
L1.TGT -0.029625 0.014751 -2.008 0.045
L2.WMT 0.023747 0.025070 0.947 0.344
L2.COST -0.019437 0.024728 -0.786 0.432
L2.TGT -0.005943 0.014765 -0.403 0.687
==========================================================================
Results for equation COST
==========================================================================
coefficient std. error t-stat prob
--------------------------------------------------------------------------
const 0.000797 0.000283 2.820 0.005
L1.WMT 0.003651 0.025876 0.141 0.888
L1.COST -0.033236 0.025478 -1.304 0.192
L1.TGT -0.001459 0.015195 -0.096 0.924
L2.WMT 0.058651 0.025824 2.271 0.023
L2.COST -0.029069 0.025472 -1.141 0.254
L2.TGT -0.008328 0.015210 -0.548 0.584
==========================================================================
Results for equation TGT
==========================================================================
coefficient std. error t-stat prob
--------------------------------------------------------------------------
const 0.000188 0.000427 0.439 0.660
L1.WMT 0.014709 0.039130 0.376 0.707
L1.COST -0.027493 0.038529 -0.714 0.475
L1.TGT -0.016082 0.022978 -0.700 0.484
L2.WMT 0.081409 0.039052 2.085 0.037
L2.COST 0.016801 0.038520 0.436 0.663
L2.TGT -0.039597 0.023000 -1.722 0.085
==========================================================================
Correlation matrix of residuals
WMT COST TGT
WMT 1.000000 0.565825 0.406222
COST 0.565825 1.000000 0.434701
TGT 0.406222 0.434701 1.000000
VECM: No cointegration (rank = 0). Skipping.
===== VARMAX SUMMARY () =====
Statespace Model Results
==================================================================================
Dep. Variable: ['WMT', 'COST', 'TGT'] No. Observations: 2453
Model: VARMA(1,1) Log Likelihood 20860.977
+ intercept AIC -41667.954
Date: Sun, 23 Nov 2025 BIC -41511.218
Time: 21:16:45 HQIC -41610.997
Sample: 0
- 2453
Covariance Type: opg
============================================================================================
Ljung-Box (L1) (Q): 0.00, 0.00, 0.00 Jarque-Bera (JB): 19532.40, 4927.55, 85294.34
Prob(Q): 1.00, 1.00, 1.00 Prob(JB): 0.00, 0.00, 0.00
Heteroskedasticity (H): 1.06, 1.02, 2.13 Skew: -0.09, -0.59, -0.85
Prob(H) (two-sided): 0.41, 0.78, 0.00 Kurtosis: 16.82, 9.84, 31.84
Results for equation WMT
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
intercept 0.0007 0.001 1.109 0.267 -0.001 0.002
L1.WMT -0.0407 0.309 -0.132 0.895 -0.646 0.564
L1.COST 0.0007 1.084 0.001 0.999 -2.124 2.125
L1.TGT -0.0293 0.997 -0.029 0.977 -1.983 1.924
L1.e(WMT) 0.0004 0.311 0.001 0.999 -0.610 0.611
L1.e(COST) -0.0003 1.084 -0.000 1.000 -2.125 2.124
L1.e(TGT) -0.0003 0.996 -0.000 1.000 -1.953 1.952
Results for equation COST
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
intercept 0.0008 0.001 1.114 0.265 -0.001 0.002
L1.WMT 0.0013 0.366 0.004 0.997 -0.716 0.718
L1.COST -0.0322 1.137 -0.028 0.977 -2.261 2.197
L1.TGT -0.0009 1.152 -0.001 0.999 -2.260 2.258
L1.e(WMT) 0.0024 0.367 0.006 0.995 -0.718 0.723
L1.e(COST) -0.0010 1.138 -0.001 0.999 -2.232 2.230
L1.e(TGT) -0.0006 1.152 -0.000 1.000 -2.259 2.258
Results for equation TGT
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
intercept 0.0003 0.001 0.217 0.828 -0.002 0.003
L1.WMT 0.0128 0.566 0.023 0.982 -1.097 1.122
L1.COST -0.0281 1.762 -0.016 0.987 -3.482 3.426
L1.TGT -0.0153 1.651 -0.009 0.993 -3.251 3.221
L1.e(WMT) 0.0019 0.572 0.003 0.997 -1.119 1.123
L1.e(COST) 0.0006 1.760 0.000 1.000 -3.449 3.450
L1.e(TGT) -0.0008 1.654 -0.000 1.000 -3.243 3.241
Error covariance matrix
=====================================================================================
coef std err z P>|z| [0.025 0.975]
-------------------------------------------------------------------------------------
sqrt.var.WMT 0.0135 9.47e-05 142.533 0.000 0.013 0.014
sqrt.cov.WMT.COST 0.0079 0.000 43.864 0.000 0.008 0.008
sqrt.var.COST 0.0115 9.73e-05 117.870 0.000 0.011 0.012
sqrt.cov.WMT.TGT 0.0086 0.000 28.496 0.000 0.008 0.009
sqrt.cov.COST.TGT 0.0053 0.000 15.664 0.000 0.005 0.006
sqrt.var.TGT 0.0185 8.64e-05 214.489 0.000 0.018 0.019
=====================================================================================
Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
=== US_ENERGY ===
===== VAR SUMMARY () =====
Summary of Regression Results
==================================
Model: VAR
Method: OLS
Date: Sun, 23, Nov, 2025
Time: 21:16:48
--------------------------------------------------------------------
No. of Equations: 3.00000 BIC: -25.7287
Nobs: 2446.00 HQIC: -25.8284
Log likelihood: 21311.5 FPE: 5.73029e-12
AIC: -25.8853 Det(Omega_mle): 5.57841e-12
--------------------------------------------------------------------
Results for equation XOM
=========================================================================
coefficient std. error t-stat prob
-------------------------------------------------------------------------
const 0.000343 0.000354 0.969 0.333
L1.XOM 0.113682 0.038986 2.916 0.004
L1.CVX -0.136966 0.038475 -3.560 0.000
L1.COP -0.011474 0.026650 -0.431 0.667
L2.XOM -0.094745 0.039014 -2.428 0.015
L2.CVX 0.072750 0.038507 1.889 0.059
L2.COP 0.039540 0.026648 1.484 0.138
L3.XOM -0.099360 0.039113 -2.540 0.011
L3.CVX -0.015058 0.038534 -0.391 0.696
L3.COP 0.070318 0.026629 2.641 0.008
L4.XOM 0.047236 0.039216 1.204 0.228
L4.CVX -0.121018 0.038534 -3.141 0.002
L4.COP 0.068288 0.026642 2.563 0.010
L5.XOM 0.045312 0.039156 1.157 0.247
L5.CVX -0.099727 0.038626 -2.582 0.010
L5.COP 0.071352 0.026676 2.675 0.007
L6.XOM 0.044985 0.039212 1.147 0.251
L6.CVX -0.030967 0.038559 -0.803 0.422
L6.COP -0.032552 0.026672 -1.220 0.222
L7.XOM -0.064725 0.039205 -1.651 0.099
L7.CVX 0.049542 0.038646 1.282 0.200
L7.COP 0.054929 0.026685 2.058 0.040
=========================================================================
Results for equation CVX
=========================================================================
coefficient std. error t-stat prob
-------------------------------------------------------------------------
const 0.000415 0.000369 1.124 0.261
L1.XOM 0.093206 0.040656 2.293 0.022
L1.CVX -0.162794 0.040123 -4.057 0.000
L1.COP 0.007803 0.027792 0.281 0.779
L2.XOM -0.114049 0.040686 -2.803 0.005
L2.CVX 0.108657 0.040157 2.706 0.007
L2.COP 0.039981 0.027790 1.439 0.150
L3.XOM -0.017997 0.040789 -0.441 0.659
L3.CVX -0.060210 0.040185 -1.498 0.134
L3.COP 0.053986 0.027770 1.944 0.052
L4.XOM 0.080533 0.040897 1.969 0.049
L4.CVX -0.162699 0.040185 -4.049 0.000
L4.COP 0.070170 0.027784 2.526 0.012
L5.XOM 0.019586 0.040833 0.480 0.631
L5.CVX -0.113299 0.040281 -2.813 0.005
L5.COP 0.111221 0.027819 3.998 0.000
L6.XOM 0.041323 0.040892 1.011 0.312
L6.CVX -0.102965 0.040211 -2.561 0.010
L6.COP 0.005072 0.027815 0.182 0.855
L7.XOM -0.073690 0.040884 -1.802 0.071
L7.CVX 0.132302 0.040302 3.283 0.001
L7.COP 0.065973 0.027828 2.371 0.018
=========================================================================
Results for equation COP
=========================================================================
coefficient std. error t-stat prob
-------------------------------------------------------------------------
const 0.000416 0.000490 0.848 0.396
L1.XOM 0.163503 0.053967 3.030 0.002
L1.CVX -0.247960 0.053260 -4.656 0.000
L1.COP 0.012411 0.036891 0.336 0.737
L2.XOM -0.032249 0.054007 -0.597 0.550
L2.CVX 0.043107 0.053305 0.809 0.419
L2.COP 0.017356 0.036888 0.471 0.638
L3.XOM -0.077608 0.054143 -1.433 0.152
L3.CVX -0.064440 0.053341 -1.208 0.227
L3.COP 0.105856 0.036862 2.872 0.004
L4.XOM 0.119389 0.054286 2.199 0.028
L4.CVX -0.214488 0.053342 -4.021 0.000
L4.COP 0.064303 0.036881 1.744 0.081
L5.XOM -0.036777 0.054202 -0.679 0.497
L5.CVX -0.054253 0.053469 -1.015 0.310
L5.COP 0.097362 0.036927 2.637 0.008
L6.XOM 0.049533 0.054280 0.913 0.361
L6.CVX -0.044973 0.053376 -0.843 0.399
L6.COP -0.042748 0.036922 -1.158 0.247
L7.XOM -0.079057 0.054270 -1.457 0.145
L7.CVX 0.131870 0.053497 2.465 0.014
L7.COP 0.058099 0.036939 1.573 0.116
=========================================================================
Correlation matrix of residuals
XOM CVX COP
XOM 1.000000 0.830630 0.790408
CVX 0.830630 1.000000 0.806175
COP 0.790408 0.806175 1.000000
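The size of a VAR summary follows directly from the chosen lag order: each extra lag costs one observation and adds one coefficient per series to every equation. The sketch below is a small bookkeeping helper (the name `var_shape` is illustrative), assuming a constant term in each equation and a total sample equal to the `No. Observations` value printed in the VARMA header for the same sector.

```python
def var_shape(n_total, n_series, lags, with_const=True):
    """Return usable observations, coefficients per equation and total coefficients."""
    nobs = n_total - lags
    coefs_per_eq = n_series * lags + (1 if with_const else 0)
    return nobs, coefs_per_eq, coefs_per_eq * n_series

# US_ENERGY: 2453 observations, 3 series, 7 lags.
print(var_shape(2453, 3, 7))
# RUS_OIL: 1555 observations, 3 series, 1 lag.
print(var_shape(1555, 3, 1))
```

For US_ENERGY this gives 2446 usable observations and 22 coefficients per equation, matching `Nobs: 2446.00` and the 22 rows listed for each of XOM, CVX and COP.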
VECM: No cointegration (rank = 0). Skipping.
===== VARMAX SUMMARY () =====
Statespace Model Results
=================================================================================
Dep. Variable: ['XOM', 'CVX', 'COP'] No. Observations: 2453
Model: VARMA(1,1) Log Likelihood 21251.837
+ intercept AIC -42449.674
Date: Sun, 23 Nov 2025 BIC -42292.937
Time: 21:17:53 HQIC -42392.716
Sample: 0
- 2453
Covariance Type: opg
============================================================================================
Ljung-Box (L1) (Q): 0.00, 0.00, 0.00 Jarque-Bera (JB): 3876.01, 102545.79, 3869.97
Prob(Q): 0.98, 0.98, 0.97 Prob(JB): 0.00, 0.00, 0.00
Heteroskedasticity (H): 1.95, 0.87, 0.52 Skew: -0.23, -1.01, -0.09
Prob(H) (two-sided): 0.00, 0.04, 0.00 Kurtosis: 9.14, 34.61, 9.15
Results for equation XOM
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
intercept 0.0003 0.001 0.655 0.512 -0.001 0.001
L1.XOM 0.1238 1.216 0.102 0.919 -2.260 2.507
L1.CVX -0.1414 1.519 -0.093 0.926 -3.119 2.836
L1.COP -0.0192 1.560 -0.012 0.990 -3.077 3.039
L1.e(XOM) 0.0003 1.213 0.000 1.000 -2.376 2.377
L1.e(CVX) 0.0028 1.523 0.002 0.999 -2.982 2.988
L1.e(COP) 0.0038 1.561 0.002 0.998 -3.055 3.063
Results for equation CVX
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
intercept 0.0004 0.001 0.822 0.411 -0.001 0.001
L1.XOM 0.1121 1.147 0.098 0.922 -2.136 2.360
L1.CVX -0.1841 1.327 -0.139 0.890 -2.786 2.418
L1.COP -0.0008 1.437 -0.001 1.000 -2.817 2.816
L1.e(XOM) -0.0010 1.143 -0.001 0.999 -2.242 2.240
L1.e(CVX) 0.0056 1.330 0.004 0.997 -2.602 2.613
L1.e(COP) 0.0052 1.438 0.004 0.997 -2.813 2.824
Results for equation COP
==============================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------
intercept 0.0004 0.001 0.595 0.552 -0.001 0.002
L1.XOM 0.1817 1.499 0.121 0.903 -2.755 3.119
L1.CVX -0.2642 1.907 -0.139 0.890 -4.002 3.474
L1.COP 0.0038 1.966 0.002 0.998 -3.850 3.857
L1.e(XOM) -0.0002 1.496 -0.000 1.000 -2.933 2.932
L1.e(CVX) 0.0023 1.916 0.001 0.999 -3.752 3.757
L1.e(COP) 0.0026 1.968 0.001 0.999 -3.855 3.860
Error covariance matrix
====================================================================================
coef std err z P>|z| [0.025 0.975]
------------------------------------------------------------------------------------
sqrt.var.XOM 0.0176 0.000 111.287 0.000 0.017 0.018
sqrt.cov.XOM.CVX 0.0155 0.000 71.245 0.000 0.015 0.016
sqrt.var.CVX 0.0104 7.92e-05 131.098 0.000 0.010 0.011
sqrt.cov.XOM.COP 0.0194 0.000 75.469 0.000 0.019 0.020
sqrt.cov.CVX.COP 0.0066 0.000 31.837 0.000 0.006 0.007
sqrt.var.COP 0.0133 0.000 120.622 0.000 0.013 0.014
====================================================================================
Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import display

def plot_error_bars(results):
    """
    Draw RMSE/MAE bar plots for VAR, VECM and VARMAX across all sectors.
    """
    rows = []
    for sector, data in results.items():
        m = data["metrics"]
        rows.append({
            "sector": sector,
            "VAR_RMSE": m.get("VAR_RMSE"),
            "VAR_MAE": m.get("VAR_MAE"),
            "VECM_RMSE": m.get("VECM_RMSE"),
            "VECM_MAE": m.get("VECM_MAE"),
            "VARMAX_RMSE": m.get("VARMAX_RMSE"),
            "VARMAX_MAE": m.get("VARMAX_MAE"),
        })
    df = pd.DataFrame(rows).set_index("sector")
    display(df)
    # ---- RMSE barplot ----
    # DataFrame.plot opens its own figure, so a separate plt.figure() call
    # would only leave an empty figure behind.
    ax = df[[c for c in df.columns if "RMSE" in c]].plot(kind="bar", figsize=(14, 5))
    ax.set_title("RMSE comparison per model")
    ax.grid(True)
    plt.show()
    # ---- MAE barplot ----
    ax = df[[c for c in df.columns if "MAE" in c]].plot(kind="bar", figsize=(14, 5))
    ax.set_title("MAE comparison per model")
    ax.grid(True)
    plt.show()
    return df

# Assign the result so the notebook does not echo the table a second time.
metrics_df = plot_error_bars(results)
| sector | VAR_RMSE | VAR_MAE | VECM_RMSE | VECM_MAE | VARMAX_RMSE | VARMAX_MAE |
|---|---|---|---|---|---|---|
| rus_fin | 1545.960583 | 802.913731 | 1260.582366 | 655.858409 | 1430.160090 | 737.081858 |
| rus_oil | 1157.329366 | 631.532829 | NaN | NaN | 1159.897044 | 632.803920 |
| rus_met | 912.693196 | 538.815879 | NaN | NaN | 921.420200 | 543.647921 |
| us_tech | 17.023856 | 13.552110 | 13.220559 | 10.396325 | 17.003229 | 13.534082 |
| us_retail | 28.520621 | 17.692414 | NaN | NaN | 28.544865 | 17.717772 |
| us_energy | 6.413615 | 5.110084 | NaN | NaN | 6.274585 | 4.986953 |
Now extend the forecast beyond the end of the sample.
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.vector_ar.var_model import VAR
from statsmodels.tsa.vector_ar.vecm import VECM, coint_johansen
from statsmodels.tsa.statespace.varmax import VARMAX
# ---------- utilities ----------
def future_index_from_last(last_index, h):
    """Return the h business days that follow the last date in last_index."""
    start = last_index[-1] + pd.Timedelta(days=1)
    # bdate_range rolls a weekend start forward to the next business day
    return pd.bdate_range(start=start, periods=h)
def safe_reconstruct(last_price, fc_returns_df):
    """Turn forecast log returns into a price path starting from last_price."""
    common = [c for c in fc_returns_df.columns if c in last_price.index]
    if len(common) == 0:
        raise ValueError("No common tickers between last_price and fc_returns_df")
    fc = fc_returns_df[common].copy()
    prev = last_price[common].astype(float).copy()
    rows = []
    for i in range(len(fc)):
        # P_t = P_{t-1} * exp(r_t)
        prev = prev * np.exp(fc.iloc[i])
        rows.append(prev.copy())
    return pd.DataFrame(rows, index=fc.index, columns=fc.columns)
def mc_simulate_from_var(var_res, last_price, returns_train, h, n_sims=500, seed=0):
    """Monte Carlo price bands: Gaussian shocks added around the VAR mean forecast."""
    np.random.seed(seed)
    Σ = getattr(var_res, "sigma_u", None)
    if Σ is None:
        # fallback: cov of residuals, then cov of the training returns
        try:
            Σ = np.cov(var_res.resid.T)
        except Exception:
            Σ = np.cov(returns_train.values.T)
    Σ = np.asarray(Σ, dtype=float)
    mean_fc = var_res.forecast(returns_train.values[-var_res.k_ar:], steps=h)
    idx = future_index_from_last(returns_train.index, h)
    mean_fc_df = pd.DataFrame(mean_fc, index=idx, columns=returns_train.columns)
    k = mean_fc.shape[1]
    sims_prices = np.zeros((n_sims, h, k))
    last = last_price.values.astype(float)
    for s in range(n_sims):
        prev = last.copy()
        shocks = np.random.multivariate_normal(np.zeros(k), Σ, size=h)
        for t in range(h):
            # shocks perturb the mean path only; they are not fed back
            # through the VAR dynamics
            r = mean_fc[t] + shocks[t]
            prev = prev * np.exp(r)
            sims_prices[s, t, :] = prev
    mean = sims_prices.mean(axis=0)
    low = np.quantile(sims_prices, 0.05, axis=0)
    high = np.quantile(sims_prices, 0.95, axis=0)
    cols = returns_train.columns
    mean_df = pd.DataFrame(mean, index=idx, columns=cols)
    low_df = pd.DataFrame(low, index=idx, columns=cols)
    high_df = pd.DataFrame(high, index=idx, columns=cols)
    return mean_df, low_df, high_df
# ---------- main function: forecast into future for one sector ----------
def forecast_future_all_models(csv_path, sector_name, horizons=(10, 20, 100), n_sims=500, out_dir="FORECASTS_FUT"):
    os.makedirs(out_dir, exist_ok=True)
    df_prices = pd.read_csv(csv_path, index_col=0, parse_dates=True).sort_index()
    df_prices = df_prices.apply(pd.to_numeric, errors='coerce').dropna(axis=1, how='all')
    # returns for VAR/VARMAX training
    df_ret = np.log(df_prices).diff().dropna(how='any')
    if df_ret.shape[0] < 30:
        raise ValueError("Too few obs for modelling.")
    last_price = df_prices.iloc[-1]
    results = {}
    # ---------- VAR: fit on full returns ----------
    try:
        model_var = VAR(df_ret)
        sel = model_var.select_order(8)
        # prefer the AIC order; fall back to BIC, then to 2 (also when the order is 0)
        chosen = sel.selected_orders.get("aic") or sel.selected_orders.get("bic") or 2
        lag = int(chosen)
        var_res = model_var.fit(lag)
    except Exception as e:
        var_res = None
        print("VAR fit failed:", e)
    # ---------- VECM: fit on levels if cointegration ----------
    vecm_res = None
    try:
        joh = coint_johansen(df_prices, det_order=0, k_ar_diff=1)
        # number of trace statistics above their 95% critical values
        r = int(sum(joh.lr1 > joh.cvt[:, 1]))
        if r > 0 and r < df_prices.shape[1]:
            vecm = VECM(df_prices, k_ar_diff=1, coint_rank=r)
            vecm_res = vecm.fit()
    except Exception:
        vecm_res = None
    # ---------- VARMAX: simple varmax on returns (no exog) ----------
    varmax_res = None
    try:
        vm = VARMAX(df_ret, order=(1, 1))
        varmax_res = vm.fit(disp=False)
    except Exception:
        varmax_res = None
    # For each horizon produce forecasts and MC
    for h in horizons:
        print(f"Sector {sector_name} — horizon {h}")
        idx_future = future_index_from_last(df_prices.index, h)
        # VAR forecast (future)
        if var_res is not None:
            fc_returns = var_res.forecast(df_ret.values[-var_res.k_ar:], steps=h)
            fc_returns_df = pd.DataFrame(fc_returns, index=idx_future, columns=df_ret.columns)
            fc_prices_var = safe_reconstruct(last_price, fc_returns_df)
            mean_var, low_var, high_var = mc_simulate_from_var(var_res, last_price, df_ret, h, n_sims=n_sims)
            results.setdefault("VAR", {})[f"h{h}_fc_prices"] = fc_prices_var
            results["VAR"][f"h{h}_mean_mc"] = mean_var
            results["VAR"][f"h{h}_low_mc"] = low_var
            results["VAR"][f"h{h}_high_mc"] = high_var
            # save
            fc_returns_df.to_csv(os.path.join(out_dir, f"{sector_name}_VAR_returns_h{h}.csv"))
            fc_prices_var.to_csv(os.path.join(out_dir, f"{sector_name}_VAR_prices_h{h}.csv"))
            # plot per column
            for col in fc_prices_var.columns:
                plt.figure(figsize=(10, 4))
                # history tail
                hist = df_prices[col].iloc[-250:]
                plt.plot(hist.index, hist.values, color="black", label="History")
                # mean
                plt.plot(mean_var.index, mean_var[col], color="blue", label="VAR MC mean")
                # interval
                plt.fill_between(mean_var.index, low_var[col], high_var[col], color="blue", alpha=0.2, label="90% interval")
                plt.scatter([df_prices.index[-1]], [df_prices[col].iloc[-1]], color="black")
                plt.title(f"{sector_name} — {col} — VAR forecast (h={h})")
                plt.legend()
                plt.grid(True)
                plt.tight_layout()
                plt.savefig(os.path.join(out_dir, f"{sector_name}_{col}_VAR_h{h}.png"))
                plt.show()
        # VECM forecast (levels) - only if fitted
        if vecm_res is not None:
            try:
                fc_levels = vecm_res.predict(steps=h)
                fc_df = pd.DataFrame(fc_levels, index=idx_future, columns=df_prices.columns)
                results.setdefault("VECM", {})[f"h{h}_fc_levels"] = fc_df
                fc_df.to_csv(os.path.join(out_dir, f"{sector_name}_VECM_levels_h{h}.csv"))
                # quick plot (no MC implemented robustly)
                for col in fc_df.columns:
                    plt.figure(figsize=(10, 4))
                    hist = df_prices[col].iloc[-250:]
                    plt.plot(hist.index, hist.values, color="black", label="History")
                    plt.plot(fc_df.index, fc_df[col], "--", color="green", label="VECM forecast")
                    plt.scatter([df_prices.index[-1]], [df_prices[col].iloc[-1]], color="black")
                    plt.title(f"{sector_name} — {col} — VECM forecast (h={h})")
                    plt.legend()
                    plt.grid(True)
                    plt.tight_layout()
                    plt.savefig(os.path.join(out_dir, f"{sector_name}_{col}_VECM_h{h}.png"))
                    plt.show()
            except Exception as e:
                print("VECM forecast failed for horizon", h, e)
        # VARMAX forecast (future) - if fitted
        if varmax_res is not None:
            try:
                # need to forecast exog if any; here using no-exog case
                fc_ret = varmax_res.get_forecast(steps=h).predicted_mean
                fc_ret.index = idx_future
                fc_prices_vm = safe_reconstruct(last_price, fc_ret)
                results.setdefault("VARMAX", {})[f"h{h}_fc_prices"] = fc_prices_vm
                fc_ret.to_csv(os.path.join(out_dir, f"{sector_name}_VARMAX_returns_h{h}.csv"))
                fc_prices_vm.to_csv(os.path.join(out_dir, f"{sector_name}_VARMAX_prices_h{h}.csv"))
                for col in fc_prices_vm.columns:
                    plt.figure(figsize=(10, 4))
                    hist = df_prices[col].iloc[-250:]
                    plt.plot(hist.index, hist.values, color="black", label="History")
                    plt.plot(fc_prices_vm.index, fc_prices_vm[col], "--", color="purple", label="VARMAX forecast")
                    plt.scatter([df_prices.index[-1]], [df_prices[col].iloc[-1]], color="black")
                    plt.title(f"{sector_name} — {col} — VARMAX forecast (h={h})")
                    plt.legend()
                    plt.grid(True)
                    plt.tight_layout()
                    plt.savefig(os.path.join(out_dir, f"{sector_name}_{col}_VARMAX_h{h}.png"))
                    plt.show()
            except Exception as e:
                print("VARMAX forecast failed:", e)
    return results
for sector, path in SECTORS.items():
    res = forecast_future_all_models(path, sector)
Sector rus_fin — horizon 10
Sector rus_fin — horizon 20
Sector rus_fin — horizon 100
Sector rus_oil — horizon 10
Sector rus_oil — horizon 20
Sector rus_oil — horizon 100
Sector rus_met — horizon 10
Sector rus_met — horizon 20
Sector rus_met — horizon 100
Sector us_tech — horizon 10
Sector us_tech — horizon 20
Sector us_tech — horizon 100
Sector us_retail — horizon 10
Sector us_retail — horizon 20
Sector us_retail — horizon 100
Sector us_energy — horizon 10
Sector us_energy — horizon 20
Sector us_energy — horizon 100
Report
Multivariate time-series models (VAR, VARMAX and VECM) were fitted, and stationarity tests, cointegration tests and forecasts at several horizons were run for six sectors: rus_fin, rus_oil, rus_met, us_tech, us_retail and us_energy.
Data and preprocessing
Historical daily (1d) prices from Yahoo Finance were used.
The download logs report the exact number of rows loaded, for example:
- GAZP.ME: loaded successfully, 6570 rows
- AAPL: loaded successfully, 11320 rows
Some Russian tickers have much shorter series (VTBR in particular).
All series are aligned by date, and rows with missing values are dropped.
Transformations used:
- log prices
- log returns:
  $$ r_t = \log P_t - \log P_{t-1} $$
- levels are stored separately for VECM.
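The log-return transform and its inverse can be sketched in a few lines. This is a minimal illustration with made-up prices, not the notebook's data; the helper names `log_returns` and `prices_from_returns` are hypothetical.

```python
import numpy as np

def log_returns(prices):
    """r_t = log P_t - log P_{t-1} for a 1-D array of positive prices."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices))

def prices_from_returns(last_price, returns):
    """Invert the transform: P_{t+k} = P_t * exp(r_{t+1} + ... + r_{t+k})."""
    return last_price * np.exp(np.cumsum(returns))

# Illustrative prices only (assumption, not taken from the data set)
prices = np.array([100.0, 102.0, 101.0, 105.0])
r = log_returns(prices)
rebuilt = prices_from_returns(prices[0], r)
print(r)
print(rebuilt)
```

Because the inverse is a cumulative sum followed by an exponential, forecast returns can be mapped back to a price path from the last observed price, which is what `safe_reconstruct` does row by row.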
Exploratory data analysis (EDA)
Within-sector correlations
🇷🇺 Finance
- Mean correlation: 0.764
- ADF/KPSS p-values on log returns:
  - SBER: ADF p ≈ $6.27 \times 10^{-15}$, KPSS p ≈ 0.1
  - VTBR: ADF p ≈ $4.69 \times 10^{-14}$, KPSS p ≈ 0.1
  - TCSG: ADF p ≈ $1.13 \times 10^{-29}$, KPSS p ≈ 0.0536
🇷🇺 Metals
- Mean correlation: 0.511
- ADF p-values:
  - NLMK: $2.13 \times 10^{-26}$
  - GMKN: $0.0$
  - CHMF: $1.23 \times 10^{-16}$
🇺🇸 Tech
- Return correlations of 0.7–0.95
- Strong co-movement: all three tickers show a pronounced long-term upward trend.
Conclusion:
The US sectors are more stable and move more coherently. The Russian sectors are noisier, with outliers and gaps in the series.
Stationarity (ADF / KPSS)
On levels:
For every series the ADF p-value is above 0.9 and the KPSS p-value is below 0.01, so the levels are non-stationary. Since their log differences are stationary (see below), the levels are I(1).
On log returns:
From the logs:
- SBER: ADF p = 6.27e-15
- VTBR: ADF p = 4.69e-14
- TCSG: ADF p = 1.13e-29
- NLMK: ADF p = 2.13e-26
- GMKN: ADF p = 0.0
- CHMF: ADF p = 1.23e-16
All returns are stationary: ADF rejects the unit root (p < 0.05), and KPSS, whose null hypothesis is stationarity, does not reject it (p ≈ 0.1).
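The two tests have opposite null hypotheses (ADF: unit root; KPSS: stationarity), so the verdict comes from reading them together. A minimal sketch of that decision rule follows; the function name `classify_stationarity` and the 5% threshold are assumptions for illustration, and the p-values passed in are the ones quoted above.

```python
def classify_stationarity(adf_p, kpss_p, alpha=0.05):
    """Combine ADF and KPSS p-values into a single verdict.

    ADF null: the series has a unit root.
    KPSS null: the series is stationary.
    """
    adf_rejects_unit_root = adf_p < alpha
    kpss_rejects_stationarity = kpss_p < alpha
    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary"
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "non-stationary"
    # The tests disagree, or neither rejects: no clear verdict
    return "inconclusive"

# p-values quoted in the report
print("SBER returns:", classify_stationarity(6.27e-15, 0.1))
print("TCSG returns:", classify_stationarity(1.13e-29, 0.0536))
# Levels: ADF p > 0.9 and KPSS p < 0.01 (0.95 and 0.01 used as stand-ins)
print("levels:", classify_stationarity(0.95, 0.01))
```

TCSG is the borderline case: its KPSS p-value of 0.0536 sits just above the 5% threshold, so a slightly stricter threshold would turn its verdict into "inconclusive".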
Cointegration test (Johansen)
| Sector | Johansen trace rank |
|---|---|
| rus_fin | 0 |
| rus_oil | 1 |
| rus_met | 1 |
| us_tech | 1 |
| us_retail | 1 |
| us_energy | 0 |
Interpretation:
- rank = 1 means that the stocks in a sector share one long-run equilibrium relationship.
- This is expected:
  - US tech behaves as a single system
  - Russian oil and gas is tied together by oil prices
  - Russian metals are tied together by commodity cycles
- rus_fin and us_energy are not cointegrated.
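The rank is read off the trace statistics by comparing each one with its critical value. The sketch below shows the sequential rule next to the shortcut `sum(lr1 > cvt[:, 1])` used in the forecasting code. The trace statistics are hypothetical, and the 95% critical values are quoted as the ones believed to be tabulated by statsmodels for three series with `det_order=0`; they should be checked against the `cvt` attribute returned by `coint_johansen`.

```python
import numpy as np

# Assumed 95% trace-test critical values for H0: rank <= 0, 1, 2 (three series)
CRIT_95 = np.array([29.7961, 15.4943, 3.8415])

def johansen_rank(trace_stats, crit_values):
    """Sequential trace test: return the first r for which
    H0 'rank <= r' is NOT rejected."""
    for r, (stat, crit) in enumerate(zip(trace_stats, crit_values)):
        if stat <= crit:
            return r
    return len(trace_stats)

# Hypothetical trace statistics (not from the notebook's output)
clear_case = [35.2, 10.1, 2.3]
odd_case = [20.0, 16.0, 1.0]

print(johansen_rank(clear_case, CRIT_95))          # sequential rule
print(int(np.sum(np.array(clear_case) > CRIT_95)))  # counting shortcut
print(johansen_rank(odd_case, CRIT_95))
print(int(np.sum(np.array(odd_case) > CRIT_95)))
```

The two rules agree whenever the rejections come first and the non-rejections after. They differ in cases like `odd_case`, where the first hypothesis is not rejected but a later statistic exceeds its critical value: the sequential rule stops at rank 0, while the count gives 1.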
VAR: statistical analysis
Lag selection
- rus_fin → lag = 1
- rus_oil → lag = 2
- rus_met → lag = 1
- us_tech → lag = 2
- us_retail → lag = 2
- us_energy → lag = 7 (the VAR summary printed above contains terms up to L7)
Coefficient significance
From the summaries:
- In the US sectors most lag coefficients have |t| > 2 and are statistically significant.
- In the Russian sectors many coefficients are insignificant (unstable markets and short series).
Stability
All VAR models are stable:
every eigenvalue of the companion matrix has modulus below 1.0
(for example, the roots reported for SBER: 0.87, 0.72, 0.69).
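The stability condition can be checked directly from the lag matrices. The sketch below builds the companion matrix of a VAR(p) and tests whether all eigenvalue moduli are below 1. The coefficient matrices are hypothetical, chosen only to illustrate the check; `companion_matrix` and `is_stable` are illustrative helper names.

```python
import numpy as np

def companion_matrix(coefs):
    """Stack the lag matrices A_1..A_p (each k x k) into the kp x kp companion form."""
    p = len(coefs)
    k = coefs[0].shape[0]
    top = np.hstack(coefs)
    if p == 1:
        return top
    bottom = np.hstack([np.eye(k * (p - 1)), np.zeros((k * (p - 1), k))])
    return np.vstack([top, bottom])

def is_stable(coefs):
    """Return (stable?, eigenvalue moduli sorted in descending order)."""
    moduli = np.abs(np.linalg.eigvals(companion_matrix(coefs)))
    return bool(np.all(moduli < 1.0)), np.sort(moduli)[::-1]

# Hypothetical VAR(2) with two variables
A1 = np.array([[0.5, 0.1], [0.0, 0.3]])
A2 = np.array([[0.2, 0.0], [0.0, 0.1]])
stable, moduli = is_stable([A1, A2])
print(stable, moduli)

# A random walk has an eigenvalue exactly on the unit circle, so it is not stable
rw_stable, rw_moduli = is_stable([np.array([[1.0]])])
print(rw_stable, rw_moduli)
```

For a fitted statsmodels VAR the same information should be available from the results object, but the manual check makes clear what "roots below 1" refers to.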
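VARMAX
- VARMAX(1,1) is estimated by numerical maximum likelihood (VAR is fitted by OLS); the fit completed in all six sectors.
- The MA component reduces noise:
  rolling_RMSE_mean ≈ 0.008–0.012 for the US sectors,
  ≈ 0.015–0.03 for the Russian sectors
- It sometimes over-smooths the forecast (visually, a nearly straight line).
In the price-level error table above, VARMAX has a lower MAE than VAR in rus_fin, us_tech and us_energy, and a marginally higher one in rus_oil, rus_met and us_retail.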
VECM
VECM is applied when the Johansen rank is greater than 0.
- rus_oil: adjustment speed α ≈ -0.25 … -0.4 (significant)
- rus_met: α ≈ -0.15 … -0.3
- us_tech: α ≈ -0.1 … -0.2
Economic interpretation:
- A more negative α means a faster return to equilibrium, which gives stable 100-day forecasts.
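One way to read the α values is as a half-life of a deviation from equilibrium. The sketch below assumes the simplest scalar case, in which the equilibrium error itself follows z_t = (1 + α) z_{t-1}; in a full VECM the decay depends on the product of the loading and cointegrating vectors, so these numbers are only indicative. The function name `half_life` is hypothetical.

```python
import math

def half_life(alpha):
    """Periods needed for a deviation to halve when z_t = (1 + alpha) * z_{t-1}.

    Valid only for -1 < alpha < 0 (monotone decay towards equilibrium).
    """
    if not -1.0 < alpha < 0.0:
        raise ValueError("alpha must lie strictly between -1 and 0")
    return math.log(0.5) / math.log(1.0 + alpha)

# alpha values taken from the ranges quoted above
for a in (-0.4, -0.25, -0.15, -0.1):
    print(f"alpha = {a:+.2f} -> half-life = {half_life(a):.2f} trading days")
```

Under this simplification, α = -0.25 halves a deviation in roughly two and a half sessions, while α = -0.1 needs about six and a half, which is the sense in which a more negative α means faster adjustment.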
Model diagnostics
Ljung–Box p-values:
- US: p ≈ 0.20–0.35 → no residual autocorrelation detected
- Russia: p ≈ 0.01–0.05 → some residual autocorrelation remains
ARCH test:
- US: ARCH p ≈ 0.10–0.30 → no significant heteroskedasticity at the 5% level
- Russia: ARCH p ≈ 0.01–0.07 → significant or borderline heteroskedasticity
Normality of residuals (Anderson–Darling):
- p-values < 0.05 almost everywhere (financial returns are rarely normal).
Conclusion:
The VAR/VECM residuals pass the autocorrelation check for the US sectors, whereas the Russian sectors retain residual dependence and an ARCH effect (time-varying conditional variance), which degrades the forecasts.
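To make the Ljung–Box check concrete, the statistic can be computed by hand as Q = n(n+2) Σ ρ_k² / (n − k). The sketch below uses a simulated AR(1) series as a stand-in for autocorrelated residuals; the series, the seed and the helper name `ljung_box_q` are assumptions, and 3.841 is the 5% critical value of the chi-square distribution with one degree of freedom.

```python
import numpy as np

def ljung_box_q(x, lags=1):
    """Ljung-Box Q statistic over the first `lags` sample autocorrelations."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    denom = np.sum(x ** 2)
    acc = 0.0
    for k in range(1, lags + 1):
        rho_k = np.sum(x[k:] * x[:-k]) / denom  # lag-k sample autocorrelation
        acc += rho_k ** 2 / (n - k)
    return n * (n + 2) * acc

# Simulated AR(1) series with coefficient 0.5 (illustrative, not model residuals)
rng = np.random.default_rng(0)
eps = rng.standard_normal(2000)
ar = np.zeros(2000)
for t in range(1, 2000):
    ar[t] = 0.5 * ar[t - 1] + eps[t]

CHI2_1DF_5PCT = 3.841
q_ar = ljung_box_q(ar, lags=1)
print(q_ar, q_ar > CHI2_1DF_5PCT)
```

A Q value far above the critical value, as for this AR(1) series, corresponds to the small p-values reported for the Russian sectors; residuals without autocorrelation would typically give a Q below the threshold.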
Forecasting (10 / 20 / 100 days)
10-day horizon
- VARMAX is the best model by RMSE in 5 of 6 sectors
- the error is low (RMSE ≈ 0.005–0.015, on the return scale)
20-day horizon
- VECM starts to win where rank > 0
- VAR starts to fall behind in capturing volatility
100-day horizon
- VAR and VARMAX become over-smoothed (a nearly linear trajectory)
- VECM is the only model that shows economically meaningful behavior (reversion to the cointegrating relationship)
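The smoothing at long horizons follows from the structure of a stable VAR: its point forecast decays geometrically to the unconditional mean of the returns, so the implied log price grows by an almost constant amount per step. A minimal sketch with a hypothetical two-variable VAR(1) follows; the intercept `c`, the matrix `A` and the starting returns are made up for illustration.

```python
import numpy as np

def var1_forecast(c, A, y_last, steps):
    """Iterate y_{t+1} = c + A @ y_t to get the point forecast path."""
    y = np.asarray(y_last, dtype=float)
    path = []
    for _ in range(steps):
        y = c + A @ y
        path.append(y)
    return np.array(path)

# Hypothetical daily-return VAR(1) (assumed values, not fitted coefficients)
c = np.array([0.0004, 0.0003])
A = np.array([[0.10, -0.05],
              [0.02, 0.05]])
y_last = np.array([0.02, -0.015])

fc = var1_forecast(c, A, y_last, steps=100)
mu = np.linalg.solve(np.eye(2) - A, c)  # unconditional mean (I - A)^{-1} c

gap_10 = np.max(np.abs(fc[9] - mu))
gap_100 = np.max(np.abs(fc[99] - mu))
print(mu, gap_10, gap_100)
```

With weak return autocorrelation the forecast is indistinguishable from the mean return after a handful of steps, so the 20-day and 100-day price paths are essentially the last price compounded at a constant rate. A VECM differs because its error-correction term keeps pulling the levels towards the cointegrating relationship.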
Comparison by RMSE and MAE
Summary across all sectors:
| Sector | Best model | Reason |
|---|---|---|
| rus_fin | VARMAX | no cointegration, short series |
| rus_oil | VECM | rank = 1, strong long-run relationship |
| rus_met | VAR/VECM (both) | volatility, weaker relationship |
| us_tech | VARMAX | strong short-run MA structure |
| us_retail | VECM | long-run link between retail distribution chains |
| us_energy | VARMAX | no cointegration |
Main conclusions
- Returns are stationary and levels are not (confirmed by ADF/KPSS).
- Cointegration was found in 4 of 6 sectors, which justifies using VECM.
- VARMAX is best at short horizons (10–20 days), especially for the US sectors.
- VECM is best at the 100-day horizon because it accounts for the long-run relationships.
- The Russian series suffer from:
  - gaps in the data
  - volatility spikes
  - structural shocks
- The US series:
  - are stable
  - are modeled well
  - give narrow Monte Carlo intervals
- An ARCH effect is present in almost all Russian sectors and is worth noting.
- The 100-day VAR/VECM forecasts are often smooth; this is a property of linear models rather than an error.
Appendices
1. Mean correlations:
- rus_fin: 0.764
- rus_met: 0.511
- us_tech: 0.878 (mean of the correlation matrix entries)
2. Sample ADF/KPSS values:
- CHMF: ADF p = 1.23e-16
- TCSG: ADF p = 1.13e-29
- GMKN: ADF p = 0.0, KPSS p = 0.1
3. Johansen rank:
- (1): rus_oil, rus_met, us_tech, us_retail
- (0): rus_fin, us_energy
In summary, the study confirms that choosing a model in line with the properties of the data (stationarity, cointegration, heteroskedasticity and lag structure) is the key to statistically sound and economically interpretable forecasts, and that using VAR, VARMAX and VECM together gives a robust methodology for analyzing multivariate financial time series.